DETECTING RELAYED COMMUNICATIONS

FIELD OF THE INVENTION 

The present invention relates to methods, apparatus and computer readable
code for detecting relayed communications.

BACKGROUND OF THE INVENTION

Relay devices are commonly used in many communication mediums and
environments, and especially on the Internet. A relay device is a communication device
that receives communications from a sender and forwards them to a receiver.

A relay device may be used in cases where direct communication between the
sender and receiver is not possible, or to enhance the performance and security of
various applications.

For example, users in a secure environment (e.g. a private corporate data
network) may be prohibited from connecting directly to HTTP servers (see RFC 2616;
for information about the RFC series of documents see the RFC Editor website at
http://www.rfc-editor.org) on the public Internet. In such cases an HTTP proxy server
may be installed in the secure network, and will be allowed to connect to outside HTTP
servers. Users can then use the proxy to relay HTTP requests and responses to and
from external HTTP servers. In this example, the HTTP proxy server is a relay device.
In another example, users on a small network (e.g. a home network) may use a SOCKS
proxy (see RFC 1928) to connect to the Internet from multiple personal computers using
one Internet connection with a single IP address (see RFC 791). In this example, the
SOCKS proxy is a relay device. In another example, some HTTP proxies serve as
cache proxies, by storing local copies of the content they receive and then serving
requests for the same content from local storage. By doing that, cache proxies reduce
the number of requests sent to remote servers. In another example, HTTP proxies
serve as content filtering proxies, by denying users' access to objectionable materials.

Besides these normal uses, relay devices are often exploited 
for malicious purposes. 

For example, a malicious user (attacker) will use a relay device to hide his real IP
address. IP addresses are often used to expose the identity of an attacker by examining
Internet Service Provider (ISP) records to reveal who used the IP address at the time of
the attack. Since the attacked party sees the communications as originating from the
relay device's IP address, the attacker remains anonymous and is less likely to suffer
consequences (e.g. losing his ISP account or getting arrested). This technique is often
used by hackers, fraudsters and scammers.

An attacker may also use several relay devices at once by instructing one relay
device to connect to another relay device and so on, and instructing the last relay
device to connect to the target. This protects the attacker in case the operator of the last
relay device is asked to provide the IP address used in the attack.

In another example, an attacker will use a large number of relay devices to
create the illusion that communications are originating from many different users.
Attackers use this technique to circumvent anti-abuse systems that block IP addresses
based on the rate of potentially abusive actions they make (i.e. number of actions made
in a time period). For example, many online services that use passwords to authenticate
their users will block an IP address after a few failed login attempts, in order to prevent
brute force attacks. In a brute force attack, an attacker attempts to recover a password
by trying many different passwords until a successful login. In another example, many
online services which provide access to a directory of personal information will block an
IP address if the rate of queries it sends exceeds a certain limit, in order to prevent
attackers from harvesting large amounts of personal information, which can be used for
other abusive actions such as sending spam (unsolicited electronic messages). In
another example, anti-spam systems will block IP addresses that send a high volume of
messages. In another example, since web sites can get paid each time a user views
an online advertisement (or clicks on it), online advertising companies will ignore large
numbers of advertisement views (or clicks on advertisements) that originate from the
same IP address, to prevent scammers from generating false views of (or clicks on)
advertisements.

By using multiple relay devices, scammers circumvent these defenses.

In another example, an attacker will use a relay device to create the illusion that
he is located in a different geographical location. Since many online credit card fraud
attempts originate from outside the United States, many US online merchants will not
accept foreign credit cards or ship products abroad. Fraudsters can overcome these
barriers by using US credit cards and shipping to accomplices in the US. Merchants
responded by rejecting orders in which the geographic location of the IP address (as
reported by IP geo-location services such as GeoPoint offered by Quova, Inc. of
Mountain View, California, USA; see US Patents 6,684,250 and 6,757,740) does not
match the address or addresses provided in the order (e.g. the credit card billing
address is in the US, while the IP address is in Indonesia). Fraudsters overcome this
barrier by using relay devices in acceptable locations.

While properly configured relay devices usually implement access control
mechanisms to allow access only to authorized users, many relay devices are globally
accessible (known as 'open proxies') and are abused by attackers. In some cases, open
proxies exist because they are shipped as part of a hardware device or software and
were unknowingly installed by their owners, or because administrators have mistakenly
or carelessly configured relay devices to relay communications from unauthorized
sources. In other cases the open proxy is maliciously installed without the permission of
the computer owner, such as by sending a 'Trojan Horse' to the computer's owner, by a
computer virus, or by manually hacking into the computer (hacking is the act of
exploiting a malfunction or misconfiguration to gain control over the computer).

Since relay devices, and especially globally accessible relay devices, are often
used for malicious purposes, many online service providers and merchants treat any
communication received through a relay device as malicious. For example, many SMTP
servers (see RFC 821) will not accept emails received through relay devices, many IRC
servers (see RFC 2810) will not accept users connected through relay devices, and
some Internet merchants will not accept orders received through relay devices.

Current methods for determining whether a communication is being relayed
through a relay device are based on examining whether communications from the
source IP address of the communication are typical of a relay device (assuming the
relay device reports its own source IP address in the relayed communication).

One such method is examining whether an HTTP communication contains HTTP
headers unique to relay devices. Examples of such headers include 'X-Forwarded-For',
'X-Originating-IP', 'Via', 'X-Cache' and 'Client-IP'. This method is limited in that it cannot
be used when the relayed protocol is not HTTP. It is further limited in that not all relay
devices report such headers, especially if relaying is performed at a level below HTTP,
as is the case with SOCKS proxies or when using the HTTP CONNECT method (see
RFC 2817).
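
By way of non-limiting illustration, the header-based check described above may be sketched as follows. The function name, the sample requests and the reduced header set are assumptions made for this sketch only; a practical list of relay-typical headers would likely be longer.

```python
# Header names drawn from the examples in the text; the set is illustrative,
# not exhaustive, and the lowercase form is used for case-insensitive matching.
RELAY_HEADERS = frozenset({"x-forwarded-for", "via", "x-cache"})


def relay_headers_present(headers):
    """Return the sorted relay-typical header names found in an HTTP request.

    `headers` is assumed to be a mapping of header name to header value.
    """
    return sorted(name.lower() for name in headers if name.lower() in RELAY_HEADERS)


# Hypothetical requests: one sent directly, one that passed through a proxy.
direct = {"Host": "example.com", "User-Agent": "ExampleBrowser/1.0"}
proxied = {
    "Host": "example.com",
    "Via": "1.1 cache.example.net",
    "X-Forwarded-For": "192.0.2.7",
}

print(relay_headers_present(direct))
print(relay_headers_present(proxied))
```

An empty result does not show that no relay was used, since, as noted above, relays operating below the HTTP level add no such headers.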

Another method is to attempt to connect back to the source IP address (create a
'backward connection') using an agreed upon protocol, which is not likely to be
implemented by relay devices. For example, many IRC servers will attempt to connect
back to the source IP address using the Identification Protocol (see RFC 1413), which
most IRC clients implement. Since relay devices are not likely to implement the
Identification Protocol, receiving an indication from the source IP address that the
connection attempt was successful (e.g. a TCP segment containing the SYN and ACK
control flags; for an explanation of TCP see RFC 793) would indicate that the
communication is most likely not being relayed. This method is limited in that service
providers and users must agree on a protocol that would be used for backward
connections, in that service providers must originate a connection to every user using
the agreed upon protocol, and in that every user must operate a server to accept such
connections.

Another method involves creating a backward connection to the source IP
address using protocols and port numbers commonly used for relay devices (e.g.
SOCKS on TCP port 1080 or HTTP on TCP port 8080) and then attempting to relay a
communication. Since most users do not operate globally accessible communication
relays on their computers, a successful attempt would indicate that the user is most
likely using a relay device. This method is limited in that service providers must originate
backward connections to every user, and in that a multitude of backward connections
are required to cover a significant portion of the possible relay device configurations.
This method is further limited in that creating multiple backward connections is a
resource consuming operation, and may be regarded as unethical, abusive or otherwise
problematic.

In an effort to alleviate the limitations of the current methods, online service
providers cooperate with each other by sharing information about relay devices. For
example, service providers often query databases (known as 'blacklists') that list various
communication parameters of globally accessible communication relays, as discovered
by other service providers or by the database operators, for example to check if a given
source IP address is listed. One such database is the MAPS Open Proxy Stopper
maintained by Mail Abuse Prevention System LLC of San Jose, California, USA. These
databases are as limited as the methods used to populate them, and are further limited
by not always being up to date.

There is an apparent need for an effective method to determine whether a
communication is being relayed through a relay device.

BRIEF SUMMARY OF THE INVENTION 

It is now disclosed for the first time a method for determining whether information
elements received from a potential relay device have been relayed through a relay
device. The disclosed method of determining whether a potential relay device is a relay
device includes receiving first and second information elements from the potential relay
device, wherein the potential relay device is an original source of the second information
element.

In some embodiments, the disclosed method further includes determining
whether a feature of an original source of the first information element and a feature of
the potential relay device are features unlikely to relate to a single device. In some
embodiments, the disclosed method further includes determining whether a feature of
an original source of the first information element and a feature of the potential relay
device are features unlikely to describe a single device.

Several features of transmitters and original sources of information elements that
are surprisingly useful for determining if a received information element has been
relayed are disclosed herein. Features of transmitters and original sources of
information elements useful for detecting if a received information element has been
relayed include but are not limited to a configuration status of a device, communications
performance of a device, a feature of a related DNS request and a latency parameter
such as a round trip time to a transmitter and/or original source of information elements.

According to some embodiments, the second information element is of a type
that a relay device of a class of relay devices is unlikely to relay.

According to some embodiments, the first information element is of a type that a
relay device of a class of relay devices is likely to relay.

Exemplary classes of relay devices relevant for embodiments of the present
invention include, but are not limited to, SOCKS proxies, HTTP proxies including HTTP
proxies using a GET method and/or a CONNECT method, IP routers and Network
Address Translation devices.

According to some embodiments, the first information element and/or second
information element are part of a communication of a type selected from the group
consisting of IP, TCP, ICMP, DNS, HTTP, SMTP, TLS, and SSL. According to some
embodiments, the first and second information elements are parts of a single
communication.

According to some embodiments, the first and second information elements are
sent in two different layers of a protocol stack.

According to some embodiments, the stage of determining includes discovering
the feature of an original source of the first information element, and discovering the
feature of the potential relay device.

According to some embodiments, the stage of determining further includes
comparing the feature of the original source of the first information element with the
feature of the potential relay device.

Thus, in one illustrative example, a configuration status parameter such as an
operating system is determined both for an original source of the first information
element, and for the potential relay device. If a discrepancy is discovered between
configuration status parameters of the original source of the information packet and the
potential relay device, this is unlikely to indicate a single device, and it is thus deduced
that the potential relay device is not the same device as the original source device, but
rather a separate relay device.
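
By way of non-limiting illustration, the operating system comparison in the above example may be sketched as follows. The sketch assumes that the operating system of the original source is guessed from an HTTP 'User-Agent' header, and that the operating system of the potential relay device (`transmitter_os`) has already been estimated by some other means, for example a TCP/IP stack fingerprinting tool. The function names and the matching rules are hypothetical and deliberately crude.

```python
def os_from_user_agent(user_agent):
    """Rough operating system guess from a User-Agent string (illustrative rules only)."""
    ua = user_agent.lower()
    if "windows" in ua:
        return "windows"
    if "mac os" in ua:
        return "macos"
    if "linux" in ua:
        return "linux"
    return None  # operating system could not be determined


def looks_relayed(user_agent, transmitter_os):
    """Return True when the two observations are unlikely to describe a single device.

    `transmitter_os` is an assumed input: the operating system estimated for the
    device that actually transmitted the packets, or None when unknown.
    """
    source_os = os_from_user_agent(user_agent)
    if source_os is None or transmitter_os is None:
        return False  # not enough evidence to report a discrepancy
    return source_os != transmitter_os


ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
print(looks_relayed(ua, "linux"))    # discrepancy between the two observations
print(looks_relayed(ua, "windows"))  # observations are consistent
```

A consistent pair of observations does not rule out a relay; it only means that this particular feature gives no indication of one.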

In some embodiments, the method comprises obtaining a parameter indicative of
the feature of an original source of the first information element, and obtaining a
parameter indicative of the feature of the potential relay device.

Thus, it is noted that it is not necessary to explicitly obtain knowledge of the
features of the source of the first information element and the source of the potential
relay device. In a specific example, a differential latency between the potential relay
device and the source of the first information element is obtained, without necessarily
obtaining the individual latencies.

In some embodiments, the method includes obtaining a parameter indicative of a
relationship between the feature of the original source of the first information element
and the feature of the potential relay device.

In some embodiments, the stage of determining includes analyzing the
parameter indicative of a relationship between the feature of the original source of the
first information element and the feature of the potential relay device.

In some embodiments, the parameter is obtained from at least one of the first
information element and the second information element.

According to some embodiments, the method further comprises sending an
outgoing communication to at least one of the original source of the first information
element and the potential relay device, and receiving a third information element from at
least one of the original source of the first information element and the potential relay
device.

According to some embodiments, the method further includes deriving from the
third information element information related to a feature of at least one of the original
source of the first information element and the potential relay device.

According to some embodiments, the method further includes verifying that an
original source of the third information element is the original source of the first
information element.

According to some embodiments, the method further includes verifying that an
original source of the third information element is the potential relay device.

In one exemplary embodiment, after receiving first and second information
elements that may have been relayed, an HTTP response and a ping are returned to
the purported source of the communication, irrespective of the presence of an
intermediate relay device. When a relay device is present, the HTTP response is relayed
by the relay device to the original source of the communications, which in turn returns a
third information element. In contrast, the relay device responds to the ping without
forwarding the ping to the original source of the first information element. Thus, a wide
differential in latencies is indicative of the presence of a relay device.
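
By way of non-limiting illustration, the latency comparison in the above embodiment may be sketched as follows. The sketch assumes that two round trip times have already been measured: one at the application level (e.g. the time between sending the HTTP response and receiving the third information element) and one by ping. The function names and the 100 millisecond threshold are assumptions chosen for the sketch and are not values taken from this disclosure.

```python
def rtt_gap_ms(app_rtt_ms, ping_rtt_ms):
    """Round trip time gap: application-level RTT minus ping RTT, in milliseconds."""
    return app_rtt_ms - ping_rtt_ms


def relay_suspected(app_rtt_ms, ping_rtt_ms, threshold_ms=100.0):
    """Return True when the latency differential is wide.

    `threshold_ms` is an arbitrary illustrative value; a practical system would
    need to calibrate it and to allow for processing delay at the end host.
    """
    return rtt_gap_ms(app_rtt_ms, ping_rtt_ms) > threshold_ms


# Hypothetical measurements: similar latencies, then a wide differential.
print(relay_suspected(app_rtt_ms=45.0, ping_rtt_ms=40.0))
print(relay_suspected(app_rtt_ms=320.0, ping_rtt_ms=40.0))
```

Only the difference between the two measurements is used, which is consistent with the observation that the individual latencies need not be known.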

According to some embodiments, the method further includes receiving a third
information element from the potential relay device, and deriving from the third
information element information related to a feature of the potential relay device.

According to some embodiments, the method further includes receiving a third
information element from the source of the first information element and deriving from
the third information element information related to a feature of a source of the first
information element.

According to some embodiments, at least one of the feature of a source of the
first communication and the feature of the potential relay device is a feature related to a
configuration status.

Exemplary features related to a configuration status include but are not limited to
an operating system type, an operating system version, a software type, an HTTP client
type, an HTTP server type, an SMTP client type, an SMTP server type, a time setting, a
clock setting and a time zone setting.

According to some embodiments, the stage of determining includes examining a
parameter indicative of the feature related to a configuration status.

Exemplary parameters indicative of the feature related to a configuration status
include but are not limited to an HTTP 'User-Agent' header, an RFC 822 'X-Mailer' header,
an RFC 822 'Received' header, an RFC 822 'Date' header, a protocol implementation
manner, a TCP/IP stack fingerprint, an IP address, a TCP port, a TCP initial sequence
number, a TCP initial window, a Whois record, a reverse DNS record, and a rate of
acknowledged information.

According to some embodiments, at least one of the feature of a source of the
first communication and the feature of the potential relay device is a feature related to
communication performance.

According to some embodiments, the feature related to communication
performance is selected from the group consisting of a measured communication
performance, a measured relative communication performance, and an estimated
communication performance.

According to some embodiments, the feature related to communication
performance is selected from the group consisting of a latency of a communication, a
latency of an incoming communication, a latency of an outgoing communication, a
communication rate, an incoming communication rate, an outgoing communication rate,
an incoming maximum communication rate, and an outgoing maximum communication
rate.

According to some embodiments, the stage of determining includes examining a
parameter indicative of the feature related to communication performance.

According to some embodiments, the parameter is selected from the group
consisting of time of receipt of an information element, time of sending of an information
element, a round trip time, a round trip time gap, an IP address, a Whois record, a
reverse DNS record, and a rate of acknowledged information.

According to some embodiments, a higher round trip time gap is indicative of a
higher likelihood that a relay device is being used for malicious purposes.

According to some embodiments, at least one of the feature of a source of the
first information element and the feature of the potential relay device is selected from
the group consisting of a sub-network, a network administrator, and a geographic
location.

According to some embodiments, the determining includes examining a
parameter indicative of at least one of the feature of a source of the first communication
and the feature of a source of the second communication, and the parameter is selected
from the group consisting of an HTTP 'User-Agent' header, an RFC 822 'X-Mailer'
header, an RFC 822 'Received' header, an RFC 822 'Date' header, an IP address, a
WHOIS record, and a reverse DNS record.

It is now disclosed for the first time a method of determining whether a potential
relay device is a relay device. The disclosed method comprises receiving first and
second information elements from the potential relay device, wherein the potential relay
device is an original source of the second information element, and analyzing a
configuration status of an original source of at least one of the first and the second
information elements, wherein the configuration status is selected from the group
consisting of an operating system type, an operating system version, a software type,
an HTTP client type, an HTTP server type, an SMTP client type, an SMTP server type,
a time setting, a clock setting, and a time zone setting.

It is now disclosed for the first time a method of determining whether a potential
relay device is a relay device. The disclosed method comprises receiving first and
second information elements from the potential relay device, wherein the potential relay
device is an original source of the second information element, and analyzing a feature
related to communication performance of an original source of at least one of the first
and the second information elements.

According to some embodiments, the feature related to communication
performance is selected from the group consisting of a latency of a communication, a
latency of an incoming communication, a latency of an outgoing communication, a
round trip time of a communication, a communication rate, an incoming communication
rate, an outgoing communication rate, an incoming maximum communication rate, and an
outgoing maximum communication rate.

It is now disclosed for the first time a method of determining whether a potential
relay device is a relay device. The disclosed method comprises sending a message to
the potential relay device inducing a final recipient of the message to send an outgoing
DNS request, and determining from the outgoing DNS request whether the potential
relay device is a relay device.

It is now disclosed for the first time a method of determining whether a potential
relay device is a relay device. The disclosed method comprises receiving first and
second information elements from the potential relay device, wherein the potential relay
device is an original source of the second information element, and checking whether a
round trip time to the potential relay device is significantly different than a round trip time
to an original source of the first information element.

It is now disclosed for the first time a method of determining whether a potential
relay device is a relay device. The disclosed method comprises receiving first and
second information elements from the potential relay device, wherein the potential relay
device is an original source of the second information element, and checking whether
an operating system of the potential relay device is different than an operating system of
an original source of the first information element.

It is now disclosed for the first time a method of determining whether a potential
relay device is a relay device. The disclosed method comprises receiving first and
second information elements from the potential relay device, wherein the potential relay
device is an original source of the second information element, and checking whether a
location of the potential relay device is different than a location of an original source of
the first information element.

It is now disclosed for the first time a method of determining whether a potential
relay device is a relay device. The disclosed method comprises receiving first and
second information elements from the potential relay device, wherein the potential relay
device is an original source of the second information element, and checking whether
an administrator of the potential relay device is different than an administrator of an
original source of the first information element.

It is now disclosed for the first time a method of determining whether a potential
relay device is a relay device. The disclosed method comprises determining whether a
feature of an original source of a first information element and a feature of the potential
relay device are features unlikely to relate to a single device, wherein the potential relay
device is a transmitter of the first information element and of a second information
element, wherein the potential relay device is an original source of the second
information element, and wherein a positive result of the determining is indicative that
the potential relay device is a relay device.

It is now disclosed for the first time a method of determining whether received
information was relayed by a relay device. The disclosed method comprises
determining from the received information element a communications performance
measurement, and generating from results of the determining output indicative of
whether the received information was relayed by the relay device.

It is now disclosed for the first time a method of determining whether received
information was relayed by a relay device. The disclosed method comprises
determining from the received information element a parameter indicative of
communications performance, and generating from results of the determining output
indicative of whether the received information was relayed by the relay device.

Exemplary determined communication performance measurements include but
are not limited to a latency of communication with a monitored host, an incoming latency
of communication with a monitored host, an outgoing latency of communication with a
monitored host, a communication rate, an incoming communication rate, an outgoing
communication rate, an incoming maximum communication rate, and an outgoing
maximum communication rate.

It is now disclosed for the first time a system for determining whether a potential
relay device is a relay device. According to some embodiments, the disclosed system
includes an information element receiver, for receiving information elements from a
plurality of devices including an information source device and the potential relay
device, and a feature incompatibility analyzer, for determining whether a feature of the
information source device and a feature of the potential relay device are features
unlikely to relate to a single device.

According to some embodiments, the system further includes a feature discovery
module, for discovering at least one feature selected from the group consisting of a
feature of the information source device and a feature of the potential relay device.

Optionally, the information element receiver is further configured to receive
information elements from a monitored host.

Optionally, the system includes an outgoing information element sender.

According to some embodiments, the system further includes a parameter
obtainer, for obtaining at least one parameter selected from the group consisting of a
parameter indicative of a feature of an information source device, a parameter indicative
of a feature of the potential relay device, and a parameter indicative of whether a
feature of the information source device and a feature of the potential relay device are
features unlikely to relate to a single device.

According to some embodiments, the system further includes a feature database
for storing a map between pairs of features and data indicative of whether the pairs of
features are incompatible features.
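
By way of non-limiting illustration, a feature database of the kind described above may be sketched as an in-memory map keyed by unordered pairs of features. The class name, the method names and the sample entries are hypothetical; the representation of a feature as a (kind, value) tuple is likewise an assumption of the sketch.

```python
class FeatureDatabase:
    """Maps unordered pairs of features to a flag saying whether they are incompatible."""

    def __init__(self):
        self._pairs = {}

    def record(self, feature_a, feature_b, incompatible):
        # A frozenset key makes the lookup independent of argument order.
        self._pairs[frozenset((feature_a, feature_b))] = incompatible

    def incompatible(self, feature_a, feature_b):
        """Return True or False for a stored pair, or None when the pair is unknown."""
        return self._pairs.get(frozenset((feature_a, feature_b)))


# Hypothetical entries, each feature written as a (kind, value) tuple.
db = FeatureDatabase()
db.record(("os", "windows"), ("os", "linux"), True)
db.record(("os", "windows"), ("http_client", "msie"), False)

print(db.incompatible(("os", "linux"), ("os", "windows")))
print(db.incompatible(("os", "linux"), ("http_client", "msie")))
```

Returning None for an unknown pair lets a caller distinguish the absence of data from a stored finding that two features are compatible.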

These and further embodiments will be apparent from the detailed description
and examples that follow.

BRIEF DESCRIPTION OF THE DRAWINGS 

In order to understand the invention and to see how it may be carried out in
practice, a preferred embodiment will now be described, by way of non-limiting example
only, with reference to the accompanying drawings, in which:

Fig. 1 provides an illustration of an environment in which a Relay Detection
System operates according to some embodiments of the present invention.

Fig. 2A provides a diagram of a Potential Relay Device sending Information
Elements to Relay Detection System.

Fig. 2B provides a diagram of the original sources of information elements
received by a Relay Detection System according to some embodiments of the present
invention.

Fig. 2C provides a diagram of the case wherein Potential Relay Device is
Information Source Device.

Fig. 2D provides a diagram of the case wherein Potential Relay Device and
Information Source Device are distinct devices.

Fig. 3 provides a description of a system according to several embodiments of
the present invention.

Fig. 4A describes the latencies of communications between Information Source
Device and Monitored Host in the case where Relay Device is being used.

Fig. 4B describes the latencies of communications between Information Source
Device and Monitored Host in the case where Relay Device is not being used.

Fig. 5A provides a diagram of the case in which Information Source Device
sends two information elements, Potentially Relayed Information Element 1 and
Potentially Relayed Information Element 2, directly to Monitored Host, without using
Relay Device.

Fig. 5B provides a diagram of the case in which two different devices, Information
Source Device and Information Source Device 2, each send an information element to a
Monitored Host, wherein both information elements are relayed by a Relay Device.

DETAILED DESCRIPTION OF THE INVENTION 

Prior to setting forth the invention, it may be helpful to an understanding thereof
to first set forth definitions of certain terms that are used hereinafter.

As used herein the term "communication" refers to the transfer of at least one
information element between two devices. For example, an IP packet transferred over
the Internet from one device to another device is a communication. In another example,
an HTTP request transferred from an HTTP client to an HTTP server over the Internet is
a communication. It should be noted that one or more communications could be
transferred in one or more other communications. A group of communications in which
one communication contains the other is called a 'Protocol Stack'. For example, a
communication in the HTTP protocol (i.e. an HTTP request or response) is normally
contained in a communication in the TCP protocol (i.e. a TCP connection), which is in
turn contained in a communication in the IP protocol (i.e. one or more IP datagrams).

As used herein the term "original source of an information element" refers to the
device that sent that information element, but has not relayed that information element
from another device.

As used herein the term "transmitter" refers to a device that sent an information
element, including in cases where the information element was previously relayed from
another device. When an information element is received from a device, that device is a
transmitter of that information element.

As used herein the term "feature" refers to any information about a device that
may be different between two different devices.

Embodiments of the present invention recite the receiving and/or the sending of
"information elements" over a data network. In some embodiments, information
elements are communicated using a communication protocol. Exemplary
communication protocols for communicating information elements over a data network
include but are not limited to HTTP, SMTP, DNS (see RFC 1034), SSL/TLS (see RFC
2246), TCP, UDP (see RFC 768), IP, and ICMP (see RFC 792).

Fig. 1 provides an illustration of an environment in which a Relay Detection
System 102 operates according to some embodiments of the present invention. In some
embodiments, Relay Detection System 102 (RDS) receives information elements
communicated from an Information Source Device 104 (ISD) over the Internet 100.

In some instances, in order to send information elements to a Monitored Host
110 (MH), an ISD 104 first sends information elements to a Relay Device 108 (RD),
which receives the information elements and subsequently relays the received
information elements to the MH 110. For these instances, the relayed information
elements travel from the Information Source Device 104 to the Monitored Host 110 as
denoted by path 122.

Alternately, the ISD 104 does not use the Relay Device 108, and sends the
information elements to the Monitored Host 110 without traversing the RD 108, as
denoted by path 120.

It is desired to ascertain whether or not information elements sent to the MH 110 
are sent through an RD 108. 

The present inventors have devised methods, apparatus and computer readable
software for determining whether information elements sent to MH 110 were sent via an
RD 108. In some embodiments, the determining is performed by RDS 102, which
monitors at least one communication received by the MH 110 and attempts to
determine whether a communication is sent to the MH 110 using RD 108. As used
herein the term 'monitor' refers to the act of receiving communications, including in
cases where the communications were sent from or to the device that is performing the
monitoring.

Any class of relay devices is appropriate for the present invention. Examples of
classes of relay devices include but are not limited to SOCKS Proxies (see RFC 1928),
HTTP Proxies used with the GET method (see RFC 2616), HTTP Proxies used with the
CONNECT method (see RFC 2817), IP Routers (see RFC 1812), and NAT devices
(see RFC 2663).

ISD 104 is a device configured to communicate with other devices over any data
network, such as Internet 100. In some embodiments, ISD 104 is a device operated by
a person or group of persons. In some embodiments, ISD 104 is a device operating
automatically. In some embodiments, ISD 104 is a device operating automatically in a
manner that simulates the actions of a person. Exemplary ISDs include but are not
limited to a computer running an HTML browser such as Microsoft Internet Explorer (for
an explanation of HTML see the HTML Specification on the W3C website at
http://www.w3.org/TR/html), a computer running an IRC client, a computer running an
SMTP client, and a cellular phone running an XHTML browser. MH 110 is a device
configured to communicate with other devices over any data network, such as Internet
100. In one exemplary embodiment, Monitored Host 110 is a server. Examples of
appropriate Monitored Hosts 110 include but are not limited to an HTTP server, an
SSL/TLS server, an SMTP server, a file server, a telnet server, an FTP server, an SSH
server, a DNS server, and an IRC server. Examples of uses of MH 110 include but are
not limited to hosting an online merchant, running an online advertising service, and
receiving email.

In some embodiments, RDS 102 monitors information elements sent from and/or
to the MH 110. Optionally, RDS 102 is further configured to communicate with the RD
108 and/or the ISD 104. Optionally, the RDS 102 is further configured to monitor
information elements sent from and/or to the RDS 102.

Each of the RDS 102, ISD 104, RD 108, and MH 110 may be hardware, software
or a combination thereof, may reside at the same or at different geographical locations,
or may be components of the same device.

For simplicity, the presented environment contains one information source
device and one relay device. In practice, there are many information source
devices connected to the Internet 100, and each of them may or may not use one of
several relay devices that are also connected to the Internet 100. The goal of the
present invention is to differentiate between the general case of an information source
device that does not use a relay device, and the general case of an information source
device that does use a relay device. It will be appreciated that the extrapolation from
the presented environment to real-life environments such as the Internet or other
networks is well within the scope of the skilled artisan.

Referring to Fig. 2A, it is noted that according to some embodiments of the
present invention, the RDS 102 receives information elements sent to the MH 110,
wherein a Potential Relay Device 150 (PRD) is a transmitter of the information
elements, but is not necessarily the original source of each or any of the information
elements. According to some embodiments, the identity of Potential Relay Device 150
(PRD) is unknown and may be either RD 108 or ISD 104.

Some embodiments of the present invention provide methods, apparatus and
computer readable software for determining whether PRD 150 is RD 108 or ISD 104.

Fig. 2B provides a diagram of the original sources of information elements
received by RDS 102 according to some embodiments of the present invention. A first
information element and a second information element are received by RDS 102.
According to some embodiments, PRD 150 is the original source of the second
information element, and thus the second information element may also be referred to
as a Non-Relayed Information Element (NRIE). In contrast, ISD 104 is the original
source of the first information element, and thus the first information element may also
be referred to as a Potentially Relayed Information Element (PRIE). Therefore, in the
case where PRD 150 is ISD 104 then PRD 150 is the original source of the PRIE, and
in the case where PRD 150 is RD 108 then PRD 150 is not the original source of PRIE.

In particular embodiments, the NRIE is of a type not relayed by a specific class of
relay devices, thus ensuring that the second information element is not actually relayed
by a relay device of that class. For example, a standard HTTP Proxy (either using the
CONNECT or the GET methods) does not relay ICMP messages. Therefore, if RD 108
is a proxy of that type and RD 108 is a transmitter of an ICMP message, then the ICMP
message can be assumed to be an NRIE.

According to some embodiments of the present invention, it is determined
whether a feature of ISD 104 and a feature of PRD 150 are features that are unlikely to
relate to the same device (Incompatible Features). A feature is said to relate to a device
if the feature is information about that specific device.

In some embodiments, the feature of ISD 104 is derived from the content of the
PRIE. In alternate embodiments, the feature of ISD 104 is derived from other
characteristics described below. In some embodiments, the feature of PRD 150 is
derived from the content of the NRIE. In alternate embodiments, the feature of PRD 150
is derived from other characteristics described below.

It is also noted that it is not a requirement of the present invention to actually
obtain either a feature of PRD 150 or a feature of ISD 104. In some embodiments
detailed below, the RDS 102 can determine whether or not the features are
Incompatible Features, without discovering each or any of the features. The presence of
Incompatible Features increases the likelihood that ISD 104 and PRD 150 are distinct
devices (i.e. PRD 150 is RD 108), while the absence of Incompatible Features
increases the likelihood that PRD 150 and ISD 104 are the same device (i.e. PRD 150
is not RD 108).

Fig. 2C provides a diagram of the case wherein PRD 150 is ISD 104. In this
case, it is detected that the original source of the first information element (ISD 104) is
the same device as the original source of the second information element (PRD 150),
and it may therefore be concluded that the Potential Relay Device 150 is the ISD 104
and NOT a Relay Device 108, and it may also be concluded that PRIE was not relayed
by the Relay Device 108.

Fig. 2D provides a diagram of the alternate case wherein PRD 150 and ISD 104
are distinct devices. In this case, it is detected that the original source of the first
information element (ISD 104) is a different device than the original source of the
second information element (PRD 150). Thus, since PRIE has an original source that is
not PRD 150, it is concluded that PRIE has been relayed by PRD 150, and that
therefore PRD 150 is a Relay Device 108.

As used herein, the term "feature" refers to at least one feature. Thus, it is
disclosed that a combination of more than one feature is defined as a feature in and of
itself.

As used herein, an "ISD-Feature" is a feature of the original source of the first
information element (i.e. the source of the PRIE, which is ISD 104), while a
"PRD-Feature" is a feature of the source of the second information element (i.e. the
source of the NRIE, which is PRD 150). Examples of ISD-Features, PRD-Features,
methods of discovering these features, and the way in which these features can be used
to determine whether ISD 104 is a different device than PRD 150 are given below.

As stated above, one option for ensuring that the NRIE was not relayed by RD
108 is to select the NRIE to be of an information element type known not to be relayed
by relay devices of the specific class of relay devices to which RD 108 belongs. For
example, it is known that certain SOCKS proxies relay HTTP communications but do
not relay TCP communications. If RD 108 is a SOCKS proxy it will maintain one TCP
connection with ISD 104 and another separate TCP connection with MH 110, and will
relay HTTP communications from one TCP connection to the other. Thus, in some
embodiments, the PRIE is part of a possibly relayed HTTP communication, and the
NRIE is part of a non-relayed TCP communication.

In some embodiments, the first and second information elements (NRIE and
PRIE) are part of two different layers on a Protocol Stack of a single communication.
Thus, in one particular case a single HTTP over TCP communication is sent, and the
first, non-relayed, information element is part of the TCP layer of the communication,
while the second, possibly relayed, information element is part of the HTTP layer of the
communication. Alternately, the first and second elements are parts of two separate
communications.

Sometimes, however, the class of the Relay Device 108 is not necessarily known
to the Relay Detection System 102. In some embodiments, the disclosed method
explicitly requires an optional step of estimating or targeting a specific class of relay
devices, and performing the method under the assumption that a potential class of relay
devices is of that targeted class of relay devices.

In some embodiments, methods of the present invention are repeated for
different pairs of NRIE and PRIE, wherein each pair is instrumental in detecting at least
one class of relay devices.

In general, it is disclosed that in some embodiments, a disclosed method may be
repeated sequentially or in parallel a number of times, wherein a final likelihood that a
potential relay device is a relay device is derived from some sort of aggregate of results
from the repeated methods. In some embodiments, obtaining an aggregate includes
obtaining an average, a weighted average, a minimum, maximum or any other method
of characterizing aggregate results known in the art.
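
By way of non-limiting illustration only, the aggregation of results from repeated
runs might be sketched as follows. The sketch assumes that each run yields a likelihood
between 0 and 1; the function name aggregate_likelihoods and the method labels are
hypothetical names chosen for this example and are not part of the disclosure.

```python
from typing import Optional, Sequence


def aggregate_likelihoods(
    likelihoods: Sequence[float],
    method: str = "average",
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine per-run relay likelihoods (each assumed in [0, 1]) into one value."""
    if not likelihoods:
        raise ValueError("at least one likelihood is required")
    if method == "average":
        return sum(likelihoods) / len(likelihoods)
    if method == "weighted_average":
        if weights is None or len(weights) != len(likelihoods):
            raise ValueError("weights must match likelihoods in length")
        total = sum(weights)
        if total <= 0:
            raise ValueError("weights must sum to a positive value")
        return sum(p * w for p, w in zip(likelihoods, weights)) / total
    if method == "minimum":
        return min(likelihoods)
    if method == "maximum":
        return max(likelihoods)
    raise ValueError(f"unknown aggregation method: {method}")
```

Which aggregate is appropriate depends on the embodiment; for instance, a maximum
flags a device as soon as any single run suggests relaying, while an average is less
sensitive to one outlying run.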

Using Communication Latency

In one particular embodiment of the present invention, ISD-Feature is chosen to
be the latency of communications between ISD 104 and MH 110, and PRD-Feature is
the latency of communications between PRD 150 and MH 110. Since the latency from
one device to another device is usually relatively stable, the same device is not
likely to exhibit two significantly different latencies. Therefore, if the ISD 104 latency and
the PRD 150 latency are significantly different, then it is relatively more likely that PRD
150 and ISD 104 are distinct devices (i.e. PRD 150 is a Relay Device 108, and PRIE
was being relayed). Similarly, if the ISD 104 latency and the PRD 150 latency are
similar, then it is relatively more likely that PRD 150 and ISD 104 are the same device
(i.e. PRIE was not being relayed).

As used herein the term 'latency' refers to the time delay between the sending of
a communication and its receipt.

Latency occurs because of various reasons. One reason is the time required for
electric signals or electromagnetic waves to travel the distance between two points on
the path of the communication (e.g. electric signals in an electric cable, light in an
optical fiber, or microwaves in a microwave link). Another reason is the time required for
information to be transferred over a communication line (e.g. it would take 25
milliseconds (ms) for 1500 Bytes to be fully transferred over a 480 kilobits-per-second
(kbps) communication line). Another reason is the store-and-forward method used in
many data networks, wherein portions of the communications (e.g. packets or cells) are
forwarded from one intermediate network element to the next only after they have been
received in full. Another reason is the delaying of communications in buffers of switching
elements in the network (e.g. when a communication line is in the process of
transferring one communication, new communications are saved in a buffer until the line
is freed up). Another reason is the processing time of certain communications, such as
for making routing decisions, encryption, decryption, compression or decompression.
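
By way of non-limiting illustration only, the transfer-time component of latency
mentioned above might be computed as follows. The sketch assumes that one kilobit
equals 1000 bits and uses a 1500-Byte payload, a size consistent with the 25 ms figure
given for a 480 kbps line; the function name serialization_delay_ms is a hypothetical
name chosen for this example.

```python
def serialization_delay_ms(payload_bytes: int, line_rate_bps: int) -> float:
    """Time, in milliseconds, to fully transfer a payload over a line.

    Assumes 8 bits per Byte and a line rate given in bits per second.
    """
    if line_rate_bps <= 0:
        raise ValueError("line rate must be positive")
    return payload_bytes * 8 * 1000 / line_rate_bps


# 1500 Bytes are 12,000 bits; at 480,000 bits per second that is 25 ms.
delay_ms = serialization_delay_ms(1500, 480_000)
```

The other latency components listed above (propagation, store-and-forward,
buffering and processing) add to this value and are not modeled in the sketch.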

Since a communication is sent from one device and received at a different
device, it is usually easier to measure Round-Trip Time (RTT) rather than latency. RTT
is the time passed between the sending of an outgoing communication (OC) from MH
110 and the receipt of an incoming communication (IC) by MH 110, which was sent
immediately upon receipt of OC. RTT is therefore the sum of the latency of the OC and
the latency of the IC. For example, when using the ICMP echo mechanism (see RFC
792), RTT is the time passed between sending the Echo message and receiving the
Echo Reply message.

Fig. 4A describes the latencies of communications between ISD 104 and MH 110
in the case where RD 108 is being used.

T1 is the latency of communications from MH 110 to RD 108.
T2 is the latency of communications from RD 108 to ISD 104.
T3 is the latency of communications from ISD 104 to RD 108.
T4 is the latency of communications from RD 108 to MH 110.



Fig. 4B describes the latencies of communications between ISD 104 and MH 110
in the case where RD 108 is not being used.

T5 is the latency of communications from MH 110 to ISD 104.
T6 is the latency of communications from ISD 104 to MH 110.

As used herein the term "ISD RTT" refers to the round-trip time of
communications between MH 110 and ISD 104.

As used herein the term "PRD RTT" refers to the round-trip time of
communications between MH 110 and PRD 150.

As used herein the term "RTT gap" refers to the difference between ISD RTT and
PRD RTT.

In the case where PRD 150 is RD 108 (i.e. PRIE is relayed) the ISD RTT should
be longer than the PRD RTT. Specifically, PRD RTT is equal to T1+T4 (the RTT
between MH 110 and RD 108), and ISD RTT is equal to T1+T2+T3+T4 (the RTT
between MH 110 and RD 108 plus the RTT between RD 108 and ISD 104). The RTT
gap equals the RTT between PRD 150 (which is RD 108) and ISD 104, which is
equal to T2+T3.

However, in the case where PRD 150 is ISD 104 (i.e. PRIE is not relayed) then
ISD RTT and PRD RTT are both equal to T5+T6 (the RTT between MH 110 and ISD
104). The RTT gap should therefore be close to zero.

For example, if ISD RTT is 840 milliseconds and PRD RTT is 130 milliseconds, it
is more likely that PRD 150 is Relay Device 108 and that PRIE is being relayed than if
ISD RTT is 133 milliseconds and PRD RTT is 130 milliseconds.

It should be noted that even in the case where PRD 150 is ISD 104 some
differences between ISD RTT and PRD RTT might be found, due to different network
delays that each communication was subject to. However, in most practical cases the
RTT gap when using a relay device is noticeably larger than the RTT gap when not
using a relay device.

When measuring the RTT gap the RDS 102 is effectively performing two feature
comparisons. The first is the comparison of the latency of a communication from ISD
104 to MH 110 compared to the latency of a communication from PRD 150 to MH 110.
The second is the comparison of the latency of a communication from MH 110 to ISD
104 compared to the latency of a communication from MH 110 to PRD 150. Both
latencies should be larger in relayed communications compared to non-relayed
communications, meaning that the RTT gap, which is the sum of two latency
differences, should also be larger in relayed communications compared to non-relayed
communications.
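
By way of non-limiting illustration only, the relationships between the latencies T1
through T4 of Fig. 4A and the resulting round-trip times might be sketched as follows.
The class name RelayedPath and the sample values are hypothetical and chosen only for
this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelayedPath:
    """One-way latencies, in milliseconds, for the relayed case of Fig. 4A."""

    t1: float  # MH 110 -> RD 108
    t2: float  # RD 108 -> ISD 104
    t3: float  # ISD 104 -> RD 108
    t4: float  # RD 108 -> MH 110

    def prd_rtt(self) -> float:
        # RTT between MH 110 and RD 108.
        return self.t1 + self.t4

    def isd_rtt(self) -> float:
        # RTT between MH 110 and ISD 104 through the relay.
        return self.t1 + self.t2 + self.t3 + self.t4

    def rtt_gap(self) -> float:
        # Difference between ISD RTT and PRD RTT; algebraically T2 + T3.
        return self.isd_rtt() - self.prd_rtt()


# Assumed sample values: a nearby relay and a distant information source.
path = RelayedPath(t1=30, t2=200, t3=210, t4=35)
```

With these assumed values the PRD RTT is 65 ms, the ISD RTT is 475 ms, and the
RTT gap is 410 ms, which equals T2+T3. In the non-relayed case of Fig. 4B both RTTs
equal T5+T6 and the gap is expected to be close to zero.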

In an exemplary embodiment RD 108 is a SOCKS proxy, ISD 104 is an HTTP
client, and MH 110 is an HTTP server. In this case, RDS 102 obtains the ISD RTT by
measuring the RTT of an HTTP communication (a PRIE), and obtains the PRD RTT by
measuring the RTT of a TCP communication (an NRIE).

In another exemplary embodiment RD 108 is an HTTP proxy using the
CONNECT method, ISD 104 is an SMTP client, and MH 110 is an SMTP server. In this
case, RDS 102 obtains the ISD RTT by measuring the RTT of an SMTP communication
(a PRIE). RDS 102 then sends an ICMP Echo message to the source IP address of the
TCP connection (an NRIE) in which the SMTP communication was received. RDS 102
then receives an ICMP Echo Reply message. The PRD RTT is the time between the
two ICMP messages (as explained below the original source of the ICMP Echo Reply
message is PRD 150).

Methods for measuring the RTT of various types of communications are 
described below. 

The RTT gap is also useful in differentiating between various uses of relay
devices. For performance and security reasons, legitimate users normally use a nearby
relay device (e.g. on the same corporate network), resulting in a short RTT gap. On the
other hand, malicious users often use a remote relay device (e.g. in another country),
resulting in a long RTT gap. A long RTT gap is therefore indicative of a relay device
used for malicious purposes. This method is especially effective in cases where
malicious users cannot avoid using a remote relay device. For example, an Indonesian
fraudster wishing to appear as if he is located in the USA must use a remote relay
device. In another example, it would be significantly more difficult for a spammer to use
a large number of relays if all of them must have short RTT gaps.

Measuring RTT 

Accurate RTT measurements require that IC be sent immediately upon receipt of
the OC. This may be achieved in several ways.

Some protocol implementations provide immediate response on some
communications. For example, in a TCP three-way handshake a segment containing
the SYN and ACK control flags (SYN-ACK segment) should be sent immediately upon
receipt of the related segment containing the SYN control flag (SYN segment). The RTT
in such a case is the time between sending of a SYN segment (OC) and receipt of a
SYN-ACK segment (IC).
In other cases, the protocol implementation might not provide an immediate
response, but an application handling the communications could be expected to
generate it. For example, an HTTP client would normally generate an HTTP request
immediately upon receiving an HTTP '302' response. In another example, an HTML
browser would normally generate an HTTP request immediately upon receiving the first
embedded image in an HTML page (e.g. using the HTML <img> tag). In another
example, an SSL/TLS layer will send a ClientHello message immediately upon receiving
the TCP SYN-ACK segment indicating the TCP connection was established. In another
example, an SMTP client would normally send a 'RCPT' command immediately upon
receiving a '250' response to a previous 'MAIL' command (see RFC 821). In another
example, an IRC client would normally send a 'PONG' message immediately upon
receiving a 'PING' message from an IRC server.

In other cases, human interaction could be used to generate immediate
responses. For example, a user is presented with a game in which he should press a
keyboard key immediately upon seeing a signal on the screen. The signal is presented
when OC is received, and IC is sent when the user responds.

Several RTT measurements taken over a short period of time may produce
different results. This is due to variations in network congestion and other parameters.
It is therefore recommended to make several RTT measurements if possible.
Furthermore, since the RTT cannot fall below the time it takes for the electric or
electromagnetic signal to travel the complete route, using the lowest of several RTT
measurements will normally produce more accurate and reliable results.
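
By way of non-limiting illustration only, selecting the lowest of several RTT
measurements might be sketched as follows. The function name best_rtt_estimate_ms
and the handling of non-positive samples are assumptions made for this example.

```python
from typing import Iterable


def best_rtt_estimate_ms(samples_ms: Iterable[float]) -> float:
    """Return the lowest plausible RTT among several measurements.

    Congestion and buffering can only add delay, so the smallest sample is
    assumed to be closest to the true propagation-bound RTT. Non-positive
    samples are treated as measurement errors and discarded.
    """
    valid = [sample for sample in samples_ms if sample > 0]
    if not valid:
        raise ValueError("no valid RTT samples")
    return min(valid)
```

For instance, given the assumed samples 142.0, 131.5 and 188.2 milliseconds, the
sketch selects 131.5 milliseconds as the estimate.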

It should be noted that a malevolent user may send an IC prior to receiving an
OC. This will deceive RDS 102 into calculating a shorter RTT. It is therefore recommended
to place some secret information in OC (secret information is information that cannot be
easily obtained by a person or device that does not already know the secret), and
require that IC contain the secret information (or a derivative of it). This prevents the
sending of an IC before receiving the secret information in OC. For example, the HTTP
'302' response described above may redirect the HTTP client to a URL that contains a
secret. The HTTP client would then generate an HTTP request to that URL, thereby
sending the same secret back. In another example, a TCP SYN-ACK segment (an OC)
contains a secret Initial Sequence Number, and a TCP segment containing the ACK
control flag (TCP ACK segment; an IC) contains an Acknowledgement Number, which
is a derivative of the Initial Sequence Number. In another example, an ICMP Echo
message (an OC) contains a secret Identifier, Sequence Number or Data, and an ICMP
Echo Reply message (an IC) contains the same secret.
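
By way of non-limiting illustration only, the use of secret information to bind an IC
to its OC might be sketched as follows. The class name RttProbe, its method names and
the injectable clock are hypothetical and chosen for this example; the token could, for
instance, be embedded in the URL of an HTTP '302' redirect.

```python
import secrets
import time
from typing import Callable, Dict, Optional


class RttProbe:
    """Issues secret tokens for OCs and accepts only ICs that echo them."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._sent_at: Dict[str, float] = {}

    def issue(self) -> str:
        """Create an unguessable token and record when the OC is sent."""
        token = secrets.token_urlsafe(16)
        self._sent_at[token] = self._clock()
        return token

    def complete(self, echoed_token: str) -> Optional[float]:
        """Return the RTT in seconds, or None if the token is unknown.

        A token is accepted once only, so a replayed or guessed IC
        does not yield a measurement.
        """
        sent = self._sent_at.pop(echoed_token, None)
        if sent is None:
            return None
        return self._clock() - sent
```

Because the token is generated only when the OC is sent, an IC sent before receipt
of the OC cannot contain it, and a measurement is produced only for a genuine round
trip.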

It should be noted that when calculating an RTT, RDS 102 must also monitor
communications sent by MH 110 (and not only communications received by MH 110) in
order to detect the time at which each OC is sent. However, this requirement may be
bypassed if an OC to ISD 104 and an OC to PRD 150 are known to be sent at the same
time. In such a case, the RTTs are unknown, but the RTT gap may still be calculated,
and is equal to the difference between the times of arrival of the respective ICs. For
example, if PRIE is SSL/TLS and NRIE is TCP, the TCP SYN-ACK segment sent by
MH 110 is such an OC, and the RTT gap is the time between receipt of the
corresponding TCP ACK segment and receipt of the SSL/TLS ClientHello message.
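
By way of non-limiting illustration only, deriving the RTT gap from arrival times
alone might be sketched as follows. The event labels "tcp_ack" and "client_hello" and
the function name rtt_gap_from_events are hypothetical; the sketch assumes that both
ICs answer the same TCP SYN-ACK segment.

```python
from typing import Dict, Iterable, Tuple


def rtt_gap_from_events(events: Iterable[Tuple[float, str]]) -> float:
    """Compute the RTT gap, in seconds, from IC arrival times only.

    Each event is (arrival_time_seconds, kind). The first "tcp_ack" is the
    IC from PRD 150 and the first "client_hello" is the IC from ISD 104.
    """
    first_seen: Dict[str, float] = {}
    for arrival, kind in events:
        first_seen.setdefault(kind, arrival)
    if "tcp_ack" not in first_seen or "client_hello" not in first_seen:
        raise ValueError("both a tcp_ack and a client_hello event are required")
    return first_seen["client_hello"] - first_seen["tcp_ack"]
```

Since both ICs respond to one OC, the sending time of the OC cancels out of the
difference and need not be known.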

If MH 110 does not send an OC which can be used by RDS 102, RDS 102 may
need to send such an OC by itself. For example, RDS 102 is monitoring HTTP
communications to and from a website (MH 110). In order to measure the RTT of the
PRIE, RDS 102 provides an HTTP '302' response for some of the HTTP requests
received by MH 110, as described above. In this example RDS 102 may need to be a
software module installed on MH 110, so it could respond to HTTP communications
sent to MH 110.

In the embodiments involving measurements of latency or RTT it is
recommended to check that ISD RTT is 'significantly different' than PRD RTT. The
RTTs are significantly different when the difference between them is one that rarely
occurs when making RTT measurements on communications with the same device. At
the time of writing of this text, the RTT between two devices in a reasonably stable
Internet environment does not vary by more than 50 milliseconds within a time frame of
a few seconds. In contrast, the RTT gap in relayed communications usually exceeds
100 milliseconds. Therefore, in some embodiments, 'significantly different' can be
interpreted to mean 'more than 60 milliseconds'. In some embodiments, 'significantly
different' can be interpreted to mean 'more than 80 milliseconds'. In some
embodiments, 'significantly different' can be interpreted to mean 'more than 100
milliseconds', and so forth.
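
By way of non-limiting illustration only, the 'significantly different' check might
be sketched as follows. The function name is_significantly_different and the default
threshold of 60 milliseconds are choices made for this example; other embodiments may
use 80 milliseconds, 100 milliseconds or another value.

```python
def is_significantly_different(
    isd_rtt_ms: float,
    prd_rtt_ms: float,
    threshold_ms: float = 60.0,
) -> bool:
    """Return True when the two RTTs differ by more than the threshold."""
    return abs(isd_rtt_ms - prd_rtt_ms) > threshold_ms
```

Applied to the earlier example, an ISD RTT of 840 milliseconds against a PRD RTT of
130 milliseconds is significantly different under any of the thresholds above, whereas
133 milliseconds against 130 milliseconds is not.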

In some embodiments, the RTT to a device is not directly measured but rather
estimated based on other information known about the device, such as its geographical
location, its IP address' reverse DNS record (i.e. host address to host name translation),
and/or its IP address' WHOIS record (for information about the WHOIS service see the
American Registry for Internet Numbers website at http://www.arin.net). For example, if
MH 110 is located in New York City, New York, USA and an IP geo-location service
indicates that PRD 150 is located in Los Angeles, California, USA and the text 'dsl' is
found in the reverse DNS record of the PRD 150's IP address indicating it is likely a
Digital Subscriber Line (DSL; for information about DSL see the DSL Forum website at
http://www.dslforum.org), then RDS 102 can estimate that the RTT between MH 110
and PRD 150 is in the range of 65-85 milliseconds. This is because it is known that the
RTT on the Internet between New York City and Los Angeles is usually approximately
55 milliseconds, and because it is known that a DSL usually adds an additional 10-30
milliseconds to the RTT.
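
By way of non-limiting illustration only, such an estimate might be sketched as
follows. The function name estimate_rtt_range_ms, the constant DSL_EXTRA_MS and the
sample host name are hypothetical; the base RTT between the two locations is assumed to
be supplied by the caller, for instance from an IP geo-location service.

```python
from typing import Tuple

# Additional RTT attributed to a DSL access line, per the figures in the text.
DSL_EXTRA_MS: Tuple[float, float] = (10.0, 30.0)


def estimate_rtt_range_ms(
    base_rtt_ms: float, reverse_dns_name: str
) -> Tuple[float, float]:
    """Estimate a (low, high) RTT range from location and access-line hints.

    base_rtt_ms is the typical RTT between the two geographic locations.
    The text 'dsl' in the reverse DNS name is taken as a hint of a DSL line.
    """
    if "dsl" in reverse_dns_name.lower():
        return (base_rtt_ms + DSL_EXTRA_MS[0], base_rtt_ms + DSL_EXTRA_MS[1])
    return (base_rtt_ms, base_rtt_ms)
```

With a base RTT of 55 milliseconds between New York City and Los Angeles and a
reverse DNS name containing 'dsl', the sketch yields a range of 65 to 85 milliseconds,
that is, 55 plus 10 to 30.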

Using Device Configuration Status

In another particular embodiment of the present invention, ISD-Feature and
PRD-Feature are the configuration status of the ISD 104 and PRD 150, respectively. As
used herein, the term 'configuration status' refers to the hardware and software of a
device and the way they are customized.

If ISD-Feature and PRD-Feature are configuration statuses that are unlikely to
relate to the same device, then it is relatively more likely that PRD 150 and ISD 104 are
distinct devices (i.e. PRD 150 is a Relay Device 108, and PRIE was being relayed).

Similarly, if ISD-Feature and PRD-Feature are configuration statuses that are
likely to relate to the same device, then it is relatively more likely that PRD 150 and ISD
104 are the same device (i.e. PRIE was not being relayed).

There are several known methods of detecting a device's configuration status
from communications. One method is to use explicit configuration information provided
by the device in the communication. For example, HTTP clients normally include in
each HTTP request the header 'User-Agent' which provides information on the
operating system type and version, the HTTP client type and version, and additional
software installed on the device. Email applications may provide similar information in
the RFC 822 'X-Mailer' header, email servers may provide such information in the RFC
822 'Received' header, and HTTP servers normally include in an HTTP response the
header 'Server' which provides information on the HTTP server type and version.

Another method is to deduce the device's configuration status from the manner it
implements various communication protocols. The popular network administration tool
'Nmap' includes many implementations of this method, as described in detail in the
article 'Remote OS detection via TCP/IP Stack Fingerprinting' written by the developer
of Nmap (available at hlrAft^ 

For example, the article suggests examining the initial Window chosen by a TCP
implementation, since different operating systems choose different numbers. It should
be noted that while the article assumes that the investigated device is responding to a
communication, this method is also applicable to communications initiated by a device
(although the specific indications may vary).

In an example of this embodiment, where RD 108 is a SOCKS proxy, the RDS
102 monitors an HTTP request (PRIE) that includes the header 'User-Agent: Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.0)'. This header indicates the operating system
is Microsoft Windows 2000. The HTTP request is sent over a TCP connection (NRIE)
that had an initial Window of 5840. This is typical of other operating systems
(specifically Red Hat Enterprise Linux). Since one device could not concurrently run two
operating systems, RDS 102 determines that ISD 104 and PRD 150 are distinct
devices (i.e. PRD 150 is a Relay Device 108, and the HTTP request was being relayed).
In this example, ISD-Feature is the information that ISD 104 is running a Microsoft
Windows 2000 operating system, and PRD-Feature is the information that PRD 150 is
running a Red Hat Enterprise Linux operating system.
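
By way of non-limiting illustration only, the comparison of an operating system
claimed in a 'User-Agent' header with one inferred from a TCP initial Window might be
sketched as follows. The lookup tables contain only the two associations mentioned in
the text and are hypothetical simplifications; a practical fingerprint database would be
far larger, and all function names are chosen for this example.

```python
import re
from typing import Optional

# Illustrative tables only; each holds the single association from the text.
NT_VERSION_TO_FAMILY = {"5.0": "Windows"}  # Windows NT 5.0 is Windows 2000
INITIAL_WINDOW_TO_FAMILY = {5840: "Linux"}  # typical of Red Hat Enterprise Linux


def family_from_user_agent(user_agent: str) -> Optional[str]:
    """Infer the OS family claimed by a User-Agent header value (ISD-Feature)."""
    match = re.search(r"Windows NT (\d+\.\d+)", user_agent)
    if match is None:
        return None
    return NT_VERSION_TO_FAMILY.get(match.group(1))


def family_from_initial_window(window: int) -> Optional[str]:
    """Infer the OS family from a TCP initial Window (PRD-Feature)."""
    return INITIAL_WINDOW_TO_FAMILY.get(window)


def features_incompatible(user_agent: str, initial_window: int) -> Optional[bool]:
    """Return True if the two OS families differ, None if either is unknown."""
    claimed = family_from_user_agent(user_agent)
    observed = family_from_initial_window(initial_window)
    if claimed is None or observed is None:
        return None
    return claimed != observed
```

For the header value 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)' and an
initial Window of 5840, the sketch reports Incompatible Features, which is consistent
with the HTTP request having been relayed.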

In another example of this embodiment, where RD 108 is an HTTP proxy using
the CONNECT method, the RDS 102 monitors an HTTP request (PRIE) that includes
the header 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/124 (KHTML,
like Gecko) Safari/125'. This header indicates the operating system is Apple Mac OS
X. The HTTP request is sent over a TCP connection (NRIE). RDS 102 connects to port
80 of the source IP address of the TCP connection and sends an HTTP request. RDS
102 then receives an HTTP response containing the header 'Server: Microsoft-IIS/5.0'
(as explained below the original source of this HTTP response is PRD 150). Since it is
known that Microsoft IIS servers do not run on the Apple Mac OS X operating system,
RDS 102 determines that ISD 104 and PRD 150 are distinct devices (i.e. PRD 150
is a Relay Device 108 and the HTTP request from ISD 104 was being relayed). In this
example, ISD-Feature is the information that ISD 104 is running an Apple Mac OS X
operating system, and PRD-Feature is the information that PRD 150 is running a
Microsoft IIS server.

In some cases it is necessary to send to the original source of an information
element an OC that will cause the original source to send back an IC that could be used
to determine its configuration. In the example given above where the HTTP 'Server'
header is used, the HTTP request sent from RDS 102 to PRD 150 is an OC and the
HTTP response is an IC. Methods of ensuring that an OC is sent to the original source
of an information element, and methods of ensuring that an IC is received from the
original source of an information element are disclosed below.

Using Location, Sub-network or Administrator Information

In another particular embodiment of the present invention, ISD-Feature and
PRD-Feature are the location, sub-network and/or administrator of the ISD 104 and
PRD 150, respectively. As used herein, the term 'location' refers to the geographic
location of a device, the term 'sub-network' refers to the group of devices that is close to
a device in the network topology, and the term 'administrator' refers to the organization
that connects the device to the network (e.g. to the Internet). Examples of a device's
location could be 'New York City, New York, United States of America' or 'latitude 32
degrees 5 minutes North, longitude 34 degrees 46 minutes East'. Examples of a device's
sub-network could be 'all devices on the same local area network segment as the
device whose IP address is 1.2.3.4'. Examples of a device's administrator could be
'Earthlink, Inc. of Atlanta, Georgia, USA', which is an Internet Service Provider that
connects users to the Internet for a fee, or 'General Electric Company of Princeton, New
Jersey, USA', which is a company that connects some of its employees to the Internet
as required for their work.

A single device is not likely to concurrently have two different locations,
sub-networks or administrators. Therefore, if ISD 104 is found to be in a different location
and/or on a different sub-network and/or has a different administrator than PRD 150,
then it is relatively more likely that ISD 104 and PRD 150 are distinct devices (i.e. PRIE
was being relayed). Similarly, if ISD 104 and PRD 150 are found to be in the same
location and/or on the same sub-network and/or have the same administrator,
then it is relatively more likely that PRD 150 and ISD 104 are the same device (i.e. PRIE
was not being relayed).

In an exemplary embodiment RD 108 is a SOCKS proxy, ISD 104 is an SMTP
client and MH 110 is an SMTP server. In this case, RDS 102 uses the source IP
address of the TCP session (an NRIE) and an IP geo-location service to estimate the
location of the PRD 150. It then uses an RFC 822 'Date' header reported in the SMTP
communication (a PRIE) to estimate the location of ISD 104. The 'Date' header
indicates the time ('wall clock') and time zone configuration of the ISD 104, and is
therefore an indication of its location (since devices are normally configured to the local
time and time zone). If the two locations do not match (e.g. the geo-location service
indicates that PRD 150 is located in New York City, while the ISD 104's time zone is
GMT+7), then RDS 102 determines that PRD 150 and ISD 104 are distinct devices (i.e.
PRD 150 is a Relay Device 108 and the SMTP communication was being relayed).
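
As a rough illustration of the 'Date' header comparison described above, the sketch
below extracts the UTC offset from an RFC 822 date string using Python's standard
`email.utils.parsedate_to_datetime` and checks it against the offsets considered
plausible for the location that a geo-location lookup returned for the source IP
address. The function names and the idea of passing the plausible offsets in as a set
are assumptions made for this example; the geo-location lookup itself is not shown.

```python
from email.utils import parsedate_to_datetime


def date_header_utc_offset_hours(date_header):
    """Return the UTC offset, in hours, stated in an RFC 822 'Date' header."""
    offset = parsedate_to_datetime(date_header).utcoffset()
    if offset is None:
        return None  # the header carried no usable time zone (e.g. '-0000')
    return offset.total_seconds() / 3600.0


def locations_mismatch(date_header, plausible_offsets_for_ip):
    """Return True when the sender's time zone does not fit the IP's location.

    plausible_offsets_for_ip is assumed to come from a geo-location service,
    e.g. {-5.0, -4.0} for an address located in New York City.
    """
    offset = date_header_utc_offset_hours(date_header)
    if offset is None:
        return False  # no time zone information, so no evidence either way
    return offset not in plausible_offsets_for_ip
```

A set of offsets is used instead of a single value so that daylight saving time at the
estimated location does not produce a false mismatch.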

Alternative methods of estimating the location of PRD 150 (other than using an
IP geo-location service) include but are not limited to: (a) Checking the IP address's
WHOIS record. The address given in the WHOIS record is likely relatively close to the
location of PRD 150; and (b) Checking the IP address's reverse DNS record (e.g. if the
record ends with '.fr' then PRD 150 is likely in France).

Alternative methods of estimating the location of ISD 104 (other than using the
'Date' header in an SMTP communication) include but are not limited to: (a) (in case
PRIE is SMTP) checking the last RFC 822 'Received' header reported in the SMTP
communication, which should contain the time zone configuration of the SMTP server
used by ISD 104, which is often the same as the time zone of the ISD 104 itself; (b) (in
case PRIE is HTTP) checking the HTTP header 'Accept-Language', which indicates the
languages supported by ISD 104 (e.g. the header 'Accept-Language: ru' means ISD
104 supports content in the Russian language, and it is therefore relatively more likely
that ISD 104 is located in Russia); and (c) checking the location of the source of another
communication which was triggered by an event at the ISD 104 (an ISD Triggered
Communication), as described in detail below, using any of the methods described
above for estimating the location of PRD 150.

In another exemplary embodiment RD 108 belongs to a class of relay devices
that do not relay TCP (i.e. TCP is an NRIE) and RDS 102 uses the source IP address of
a TCP session to estimate the sub-network of PRD 150. This may be done by retrieving
the WHOIS record, reverse DNS record or routing database record associated with that
IP address (for information about routing databases see the RADb website at
http://www.radb.net). This may also be done by discovering routers adjacent to PRD
150 using 'Time Exceeded' ICMP messages sent in response to IP packets with small
Time-to-Live values, as done by the Unix utility traceroute (or the equivalent Microsoft
Windows utility tracert), since devices in the same sub-network normally connect to the
Internet 100 through the same routers. RDS 102 then estimates the sub-network of the
ISD 104. This may be done by checking the sub-network of the source of an ISD
Triggered Communication, as described in detail below, using any of the methods
described above for estimating the sub-network of PRD 150. If the two sub-networks do
not match (e.g. the sub-network names in the devices' WHOIS records are different or
the devices do not have a common adjacent router), then RDS 102 determines that
PRD 150 and ISD 104 are distinct devices (i.e. PRD 150 is a Relay Device 108 and the
PRIE was being relayed).
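
The 'common adjacent router' test mentioned above could be sketched as a set
intersection over the last few hops of two traced paths. The code below assumes the
hop lists have already been collected by a traceroute-like tool (not shown), that a
non-responding hop is recorded as None, and that the destination itself is excluded
from the list. The function names and the default depth of two hops are illustrative
assumptions.

```python
def adjacent_routers(hops, depth=2):
    """Return the last `depth` responding routers on a traced path to a device."""
    responding = [hop for hop in hops if hop is not None]
    return set(responding[-depth:])


def share_adjacent_router(hops_to_first, hops_to_second, depth=2):
    """Return True when the two traced paths end in at least one common router."""
    first = adjacent_routers(hops_to_first, depth)
    second = adjacent_routers(hops_to_second, depth)
    return bool(first & second)
```

When `share_adjacent_router` returns False for the paths to PRD 150 and to the source
of an ISD Triggered Communication, the two sub-networks would be treated as not
matching in the sense used above.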

In another exemplary embodiment, RD 108 belongs to a class of relay
devices that do not relay TCP (i.e. TCP is an NRIE) and RDS 102 uses the source IP
address of a TCP session to estimate the administrator of PRD 150. Exemplary
methods for estimating the administrator include but are not limited to: (a) Retrieving the
WHOIS record associated with that IP address (the organization name given in the
record is likely the administrator); (b) Retrieving the reverse DNS record associated with
that IP address (e.g. if the record ends with 'cox.net' then the administrator is likely 'Cox
Communications of Atlanta, Georgia, USA'); (c) Retrieving the WHOIS record
associated with the second-level domain in the reverse DNS record associated with that
IP address; and (d) Retrieving the routing database record associated with that IP
address. RDS 102 then estimates the administrator of ISD 104. Methods for estimating
the administrator of ISD 104 include but are not limited to: (a) (in case PRIE is HTTP)
checking the HTTP header 'User-Agent' which sometimes contains information about
the administrator. For example, a 'User-Agent' header containing the text 'AOL 9.0'
indicates the HTML browser installed on ISD 104 was provided by America Online, Inc.
of Dulles, Virginia, USA (AOL). Since many Internet users receive their HTML browser
from the same organization through which they connect to the Internet, this increases
the likelihood that ISD 104 connects to the Internet through AOL (i.e. AOL is the
administrator). In another example, the text 'Cox High Speed Internet Customer'
indicates the browser was provided by Cox Communications, Inc. of Atlanta, Georgia,
USA; (b) (in case PRIE is SMTP) checking the RFC 822 'Organization' header, which
sometimes contains information about the administrator; and (c) checking the
administrator of the source IP address of an ISD Triggered Communication, as
described in detail below, using any of the methods described above for estimating the
administrator of PRD 150.

ISD Triggered Communications 

Information about ISD 104 may be discovered by triggering ISD 104 to send
communications. Such a communication is an 'ISD Triggered Communication'.

Exemplary ISD Triggered Communications that expose the IP address of ISD
104 are disclosed in PCT Application WO02/08653, the entirety of which is herein
incorporated by reference. After receiving the ISD 104's IP address, RDS 102 may use
the methods described above to find its location, sub-network or administrator. It can
then compare them to the location, sub-network or administrator of the PRD 150, and
determine whether they are different devices. Alternatively, RDS 102 can compare the
ISD 104's IP address to the PRD 150's IP address directly. If they are different, it is
relatively more likely that ISD 104 is a different device than PRD 150 (i.e. PRD 150 is a
Relay Device 108, and PRIE was being relayed).

Another exemplary triggered communication is a DNS request. In some cases,
ISD 104 will send a DNS request associated with a PRIE, in order to translate a
hostname into an IP address. The ISD 104 will send the DNS request to the DNS
server(s) that it is configured to use. The ISD 104's DNS server will then send a DNS
request to the authoritative DNS server for the given hostname. RDS 102 monitors DNS
requests sent to this authoritative DNS server. It should be noted that although the DNS
request that RDS 102 monitors may be received from the DNS server that ISD 104 is
configured to use, ISD 104 is the original source of at least one information element
contained in the DNS request. US Patent application US2002018831, the entirety of
which is herein incorporated by reference, describes methods of causing a device to
generate DNS requests, as well as methods of associating the DNS requests with the
identity of the device that originated them. One method to associate a DNS request with
the relevant PRIE is to configure the authoritative DNS server to reply with a different IP
address to each request for translation of a hostname, and to keep a record of which
DNS request received which IP address as a reply. When the PRIE is received at one of
the IP addresses, the associated DNS request can be retrieved from the record.
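
The record-keeping step described above can be sketched as a small table that maps
each reply address to the DNS request that received it. The class below is a
hypothetical illustration only: the class and method names are assumptions, the pool
of reply addresses is supplied by the caller, and address reuse or expiry, which a
real deployment would need, is left out.

```python
class DnsRequestCorrelator:
    """Hands out a distinct reply address per DNS request and remembers the pairing."""

    def __init__(self, address_pool):
        self._free = list(address_pool)  # reply addresses not yet handed out
        self._by_address = {}            # reply address -> requesting DNS server

    def answer(self, resolver_ip):
        """Choose an unused address to return as the reply to this DNS request."""
        address = self._free.pop(0)  # raises IndexError once the pool is exhausted
        self._by_address[address] = resolver_ip
        return address

    def resolver_for(self, destination_ip):
        """Return the DNS server whose request was answered with this address."""
        return self._by_address.get(destination_ip)
```

When a PRIE later arrives at one of the handed-out addresses, `resolver_for` yields
the source of the associated DNS request, whose location, sub-network or
administrator can then be estimated.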

After receiving the DNS request from the ISD 104's DNS server, RDS 102 may
use the methods described above to find its location, sub-network or administrator. For
performance and economical reasons, many devices connected to the Internet are
configured to use a DNS server that is in a similar location, on the same sub-network, or
administered by the same administrator. RDS 102 can therefore compare the location,
sub-network or administrator of the PRD 150, to the location, sub-network or
administrator of the ISD 104's DNS server, respectively. If they do not match, it is
relatively more likely that PRD 150 is a different device than ISD 104 (i.e. PRIE is being
relayed).



Using Maximum Communication Rate 

In another particular embodiment of the present invention, the maximum
communication rate (MCR) of the PRD 150 and ISD 104 are used to determine whether
they are the same device. The term MCR refers to the maximum amount of information
a device can receive (incoming MCR) or send (outgoing MCR) during a time interval.
'Bits per Second' (bps) is a common measure of MCR. For example, some DSL lines
have an incoming MCR of 1.5 million bps (Mbps), and an outgoing MCR of 96 thousand
bps (kbps).

If the MCR of the PRD 150 is different than that of the ISD 104 then it is more
likely that ISD 104 is a different device than PRD 150 (i.e. PRIE is being relayed).

In an example of this embodiment, RD 108 is connected to the Internet 100 on a
DSL line with a 1.5 Mbps incoming MCR and a 96 kbps outgoing MCR. Since relayed
communications from MH 110 to ISD 104 are transferred through the incoming interface
of RD 108 and then through its outgoing interface, the incoming MCR of the ISD
104 could not exceed the outgoing MCR of RD 108 (96 kbps). Therefore, PRD 150's
incoming MCR is 1.5 Mbps while ISD 104's incoming MCR is 96 kbps or less.

In another example of this embodiment, RD 108 is connected to the Internet 100
at an MCR of 20 Mbps in both directions, and ISD 104 is connected to the Internet 100
on a DSL line with a 1.5 Mbps incoming MCR and a 96 kbps outgoing MCR. Therefore,
PRD 150's incoming MCR is 20 Mbps while ISD 104's incoming MCR is 1.5 Mbps, and
PRD 150's outgoing MCR is 20 Mbps while ISD 104's outgoing MCR is 96 kbps.

Detecting MCR 

According to some embodiments of the present invention, checking whether the
MCR of ISD 104 is different than the MCR of PRD 150 is done by detecting the MCR of
each and comparing them. One method of detecting an MCR is sending information to,
or receiving information from, a device, using a protocol that automatically adjusts to the
maximum communication rate possible, such as TCP, and observing the
communication rate. For an explanation of some of the mechanisms used in TCP for
adjusting to the MCR, see the article 'JACOBSON, V. Congestion avoidance and
control. In Proceedings of SIGCOMM '88 (Stanford, CA, Aug. 1988), ACM' and the
articles it references.

Another method of observing the incoming communication rate of a device
involves monitoring receipt acknowledgements received from the device. For example,
a device receiving information on a TCP connection sends from time to time the amount
of information it has successfully received. Receiving an acknowledgement of byte
number 3,000,000 at one time and an acknowledgement of byte number 3,250,000
twenty seconds later would indicate that the device is receiving information at a rate of
12,500 Bytes per Second, which equals 100,000 Bits per Second (i.e. its incoming MCR
is 100 kbps). In another example, as described above, an HTML browser would
normally generate an HTTP request immediately upon receiving the first embedded
image in an HTML page (e.g. using the HTML <img> tag). Receiving the HTTP request
forty seconds after starting to send an HTML page in which the embedded image tag is
located at offset 1,000,000 bytes would indicate that the device is receiving information
at a rate of 25,000 Bytes per Second, which equals 200,000 Bits per Second (i.e. its
incoming MCR is 200 kbps). It should be noted that this measurement might be
inaccurate since it compares the sending of the beginning of the HTML page to the
receipt of the HTTP request, thereby adding at least one round trip time to the
measurement. Measuring the round trip time and subtracting it from this time
measurement may produce more accurate results.
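
The arithmetic in the two examples above can be written as a short Python function.
This is only a sketch: the function name and its optional `rtt_seconds` parameter are
assumptions introduced here to show the round trip time correction suggested in the
text.

```python
def observed_rate_bps(bytes_transferred, elapsed_seconds, rtt_seconds=0.0):
    """Estimate a communication rate in bits per second.

    rtt_seconds is subtracted from the elapsed time to remove the extra round
    trip included in measurements that end with a reply from the device.
    """
    effective_seconds = elapsed_seconds - rtt_seconds
    if effective_seconds <= 0:
        raise ValueError("elapsed time must exceed the round trip time")
    return bytes_transferred * 8 / effective_seconds
```

With the figures from the text, `observed_rate_bps(3_250_000 - 3_000_000, 20)` gives
100,000 bps and `observed_rate_bps(1_000_000, 40)` gives 200,000 bps. Supplying a
non-zero `rtt_seconds` raises the second estimate, which reflects that the uncorrected
measurement understates the rate.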

RD 108 normally buffers relayed communications between MH 110 and ISD 104.
Buffers are temporary storage into which RD 108 writes communications it receives,
and from which RD 108 reads the communications it sends. When the buffers are not
full, RD 108 can receive communications at its incoming MCR. When the buffers
are full, RD 108 can receive communications at the rate it can empty the buffers (i.e. at
the rate it can send communications out from the buffers). This means that a
measurement of the communication rate from MH 110 to RD 108 would be an indication
of RD 108's incoming MCR when the buffers are not full, and an indication of ISD 104's
incoming MCR when the buffers become full. The buffers would become full if ISD 104's
incoming MCR is smaller than RD 108's incoming MCR. For example, if RD 108 has
buffers of size 100,000 bytes, and its incoming MCR is 1.5 Mbps, and ISD 104's
incoming MCR is 96 kbps, then the buffers would be filled at a rate of 1.404 Mbps. It
would take the buffers approximately 0.57 seconds to become full at this rate. In this
example, a measurement of the communication rate during the first 0.57 seconds would
be 1.5 Mbps, and this would be an indication of the RD 108's incoming MCR, and a
measurement of the communication rate at any time after the first 0.57 seconds would
be 96 kbps, and this would be an indication of the ISD 104's incoming MCR.
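
The buffer arithmetic in this example can be checked with a few lines of Python. The
sketch below assumes, as the text does, that the buffer fills at the difference
between the rate at which the relay receives and the rate at which it can forward;
the function names are hypothetical.

```python
def net_fill_rate_bps(incoming_bps, outgoing_bps):
    """Rate at which a relay's buffer grows, in bits per second (never negative)."""
    return max(incoming_bps - outgoing_bps, 0)


def seconds_until_full(buffer_bytes, fill_rate_bps):
    """Time for an initially empty buffer to fill at the given net rate."""
    if fill_rate_bps <= 0:
        return float("inf")  # the buffer never fills when it drains fast enough
    return buffer_bytes * 8 / fill_rate_bps
```

For a 100,000-byte buffer and the 1.404 Mbps fill rate given above,
`seconds_until_full(100_000, 1_404_000)` evaluates to roughly 0.57 seconds, which
agrees with the figure in the text.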

Therefore, in the case where PRD 150 is RD 108, the incoming MCR of PRD 150
is indicated by PRD 150's incoming communication rate before the buffer is full, and the
incoming MCR of ISD 104 is indicated by PRD 150's incoming communication rate after
the buffer is full.

In the case where PRD 150 is ISD 104 this buffer does not exist and PRD 150's
incoming communication rate does not change, and remains equal to ISD 104's
incoming MCR.

Therefore, PRD 150's incoming MCR is equal to PRD 150's incoming
communication rate until the time it would take RD 108's buffers to be filled, and ISD
104's incoming MCR is equal to PRD 150's incoming communication rate thereafter.

RDS 102 can thus determine whether PRD 150 is a Relay Device 108 by
comparing PRD 150's incoming communication rates at these two times.

In many cases, the communication rate from RD 108 to MH 110 does not exceed
the ISD 104's outgoing MCR, because RD 108 would send to MH 110 only the
information it receives from ISD 104, and would not normally generate significant
amounts of information by itself. However, since, as described above, RD 108 buffers
communications between ISD 104 and MH 110, it is possible to measure the outgoing
MCR of RD 108 as follows. Some protocols contain a 'flow control' mechanism, which
allows a device receiving information to signal to a device sending the information when
to stop sending new information and when to resume sending. For example, in TCP if a
receiving device sends an ACK segment with a 'Window' value of zero, a sending
device would normally stop sending new information, until it receives an ACK segment
with a positive 'Window' value. A 'flow control' mechanism may be used by MH 110 or
RDS 102 to signal to RD 108 to stop sending new information. This would cause RD
108's buffers to be filled by information, which RD 108 will continue to receive from ISD
104, and would not send to MH 110. MH 110 or RDS 102 would then use the 'flow
control' mechanism to signal to RD 108 to resume sending new information. Since
RD 108 would now send information from its local buffers until the buffers are empty, a
measurement of the communication rate would be an indication of the outgoing MCR of
RD 108.

According to some embodiments of the present invention, checking whether the
MCR of ISD 104 is different than the MCR of PRD 150 is done by detecting an
indication whether buffers in PRD 150 are being filled or emptied. For example, as
described above, buffers in RD 108 would be filled if the incoming MCR of ISD 104 is
smaller than the incoming MCR of PRD 150. In another example, as described above,
buffers in RD 108 would be emptied if the outgoing MCR of ISD 104 is smaller than the
outgoing MCR of PRD 150. Some protocols allow a device to advertise the capacity of
its buffers, or a derivative thereof, to other devices it communicates with, and this
information may be used to estimate whether a device's buffers are filling or becoming
empty. For example, in TCP a device would advertise the amount of information it is
willing to receive in the 'Window' header of every ACK segment it sends. Changes in
the value of the 'Window' header normally reflect changes in the capacity of the device's
buffer. An increase in the 'Window' value indicates the buffers are being emptied, while
a decrease in the 'Window' value indicates the buffers are being filled.

Therefore, a decreasing 'Window' value is a direct indication that PRD 150's
incoming MCR is larger than ISD 104's incoming MCR, meaning that PRD 150 and ISD
104 are distinct devices (i.e. PRD 150 is a Relay Device 108).
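
One simple way to turn a series of advertised 'Window' values into the filling or
emptying indication described above is to look at the net change over the series. The
sketch below is illustrative only: the function name, the returned labels and the
`min_change` threshold are assumptions, and capturing the 'Window' values from ACK
segments is not shown.

```python
def buffer_trend(window_values, min_change=1):
    """Classify a sequence of advertised TCP 'Window' values.

    Returns 'filling' when the window shrank, 'emptying' when it grew,
    'steady' when the net change is below min_change, and 'unknown' when
    fewer than two observations are available.
    """
    if len(window_values) < 2:
        return "unknown"
    change = window_values[-1] - window_values[0]
    if change <= -min_change:
        return "filling"
    if change >= min_change:
        return "emptying"
    return "steady"
```

A result of 'filling' for PRD 150 corresponds to the decreasing 'Window' value that
the text treats as an indication of a Relay Device 108.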

Another method of detecting an MCR is measuring the latencies of two
communications, where the only difference between them is the amount of information
contained in each, and then calculating the MCR from the difference between the two
latencies. As described above, the time it takes for a communication to be transferred
over a communication line is proportional to the amount of information contained in the
communication. Therefore, the MCR is equal to the difference between the amounts of
information in the two communications, divided by the difference between the two
latencies.
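
Written out, the calculation above is MCR = (difference in size) / (difference in
latency). A minimal Python sketch follows; the function and parameter names are
assumptions, sizes are taken in bytes and latencies in seconds.

```python
def mcr_from_latencies(small_bytes, small_latency_s, large_bytes, large_latency_s):
    """Estimate the MCR in bits per second from two size/latency measurements."""
    extra_bits = (large_bytes - small_bytes) * 8
    extra_seconds = large_latency_s - small_latency_s
    if extra_bits <= 0 or extra_seconds <= 0:
        raise ValueError("the larger communication must also take longer")
    return extra_bits / extra_seconds
```

For instance, if a 100-byte communication takes 0.25 seconds and an 1,100-byte
communication takes 0.5 seconds, the extra 8,000 bits took 0.25 seconds, so the
estimate is 32,000 bps. Taking the difference cancels any fixed delay that is common
to both measurements.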

This method can also be used with RTT measurements, rather than latency. For
example, the RTT of the ICMP echo mechanism, with a short Echo message (e.g. 84
bytes), and the RTT with a long Echo message (e.g. 1500 bytes), can be used to detect
the device's MCR. It should be noted that if the incoming and outgoing communications
contain the same amount of information, as in the ICMP echo example, the RTT method
would provide a combined measurement of both the device's incoming and outgoing
MCR. One method to overcome this limitation is to make RTT measurements using a
protocol in which only one of the incoming and outgoing communications contains a large
amount of information. This would reduce the contribution of the other communication's
latency to the RTT, and thus provide a better approximation of the latency of the first
communication. For example, a TCP SYN segment sent to a closed port would normally
trigger a segment containing the RST flag (RST segment). While the SYN segment can
be of a size chosen by its sender, the RST segment would usually be small (e.g. 40
bytes).

Another method of detecting an MCR is to estimate it from other information
known about the device. For example, finding the text 'dsl' in the reverse DNS of the
device's IP address indicates it is likely a DSL line, which usually has an incoming MCR
of 500-2,500 kbps and an outgoing MCR of 64-256 kbps. In another example, finding
that the PRD 150's administrator is AOL indicates it is likely connected through a
voice-modem dial-up line (since this is the primary connectivity option offered by AOL), which
usually has an incoming MCR of up to 56 kbps and an outgoing MCR of up to 33 kbps.

Two Devices Using the Same Relay Device

In another particular embodiment of the present invention, RDS 102 detects that
PRD 150 is a Relay Device 108 by determining that the original sources of two
information elements are two different devices, without requiring that one of the original
sources be the PRD 150.

In this embodiment RDS 102 attempts to distinguish between the following two
cases:

Fig. 5A provides a diagram of the first case, in which ISD 104 sends two
information elements, PRIE1 and PRIE2, directly to MH 110, without using RD 108. The
two information elements are of a type that RD 108 may relay.

Fig. 5B provides a diagram of the second case, in which two different devices,
ISD 104 and ISD2 106, each send an information element to MH 110, wherein both
information elements are relayed by RD 108.

RDS 102 distinguishes between the two cases as follows:

RDS 102 monitors communications of MH 110 and therefore receives from RD
108 two potentially relayed information elements (PRIE1 and PRIE2). RDS 102 then
checks whether a feature (PRIE1-Feature) of an original source (PRIE1-Source) of
PRIE1 and a feature (PRIE2-Feature) of an original source (PRIE2-Source) of PRIE2
are Incompatible Features. If indeed PRIE1-Feature and PRIE2-Feature are
Incompatible Features, then it is more likely that the original source of PRIE1 and the
source of PRIE2 are different devices, which means the second case is more likely,
which means that PRIE1 and PRIE2 have likely been relayed.

For simplicity reasons, the description of this embodiment contains only three
devices, two of which use a relay device. In practice, there are many devices connected
to the Internet 100, and each of them may or may not use one of several relay devices
that are also connected to the Internet 100. More generally, this embodiment detects
that a potential relay device is a relay device by determining that at least two original
sources of at least two potentially relayed information elements which were received
from the potential relay device, have Incompatible Features. Since by definition a single
device is not likely to have Incompatible Features, it is likely that the two original
sources are different devices, and therefore at least one of the sources is not the
potential relay device, which means the information element from this source has been
relayed and the potential relay device is a relay device.

All methods described above for determining whether features are Incompatible
Features are also applicable for this embodiment, mutatis mutandis. For example, RDS
102 may compare the HTTP 'User-Agent' headers provided in PRIE1 and PRIE2, and if
they are indicative of different operating systems determine that ISD 104 and ISD2 106
are distinct devices and therefore PRD 150 is a Relay Device 108. In another example,
RDS 102 may detect that the RTT of communications with PRIE1-Source and
PRIE2-Source are significantly different. It should be noted that in this case each RTT is the
sum of (a) the RTT between RD 108 and MH 110, and (b) the RTT between each
original source and RD 108. Therefore, the RTT gap in this case is a result of
differences between (c) the RTT between PRIE1-Source and RD 108, and (d) the RTT
between PRIE2-Source and RD 108. The two RTTs are not necessarily significantly
different, but in some cases they may be, allowing RDS 102 to detect an RTT gap.
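
Because the leg between RD 108 and MH 110 is common to both measurements, it
cancels when the two RTTs are subtracted, leaving only the difference between the
two source-to-relay legs. The sketch below illustrates one possible way to test for
such a gap. Using the median of several samples and a fixed threshold in
milliseconds are assumptions made for this example, not something the text
prescribes.

```python
from statistics import median


def rtt_gap_ms(rtts_source_1, rtts_source_2):
    """Absolute difference between the median RTTs of two original sources."""
    return abs(median(rtts_source_1) - median(rtts_source_2))


def rtt_gap_detected(rtts_source_1, rtts_source_2, threshold_ms=100.0):
    """Return True when the two sources' RTTs differ by more than the threshold."""
    return rtt_gap_ms(rtts_source_1, rtts_source_2) > threshold_ms
```

The median is used here so that a single delayed packet does not by itself create
or hide a gap.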

Ensuring Communications with an Original Source 

In some embodiments of the present invention, information relating to features of
devices is obtained from one or more new communications, and not from the original
communication or communications that contained NRIE and/or PRIE. For such new
communications to be usable, it should be verified that these are communications with
the original source of the PRIE and/or NRIE.

US Published Patent Application 20040243832, the entirety of which is herein
incorporated by reference, discloses several methods for determining that two network
messages originated from the same sender. While these methods refer to messages
rather than information elements or communications, and to senders rather than original
sources, it will be appreciated by someone skilled in the art that most of the methods
described therein are applicable to detecting that two information elements were sent by
the same original source.

In one exemplary embodiment, ICMP packets and TCP/IP packets received with
the same source IP address or sent to the same destination IP address could be
considered as communications with the same device.

In another exemplary embodiment, incoming TCP/IP packets belonging to one
TCP connection could be considered to come from the same device.

In another exemplary embodiment, an HTTP response sent following an HTTP
request, in accordance with the HTTP protocol, could be considered as a communication
with the source of the HTTP request.



The System

Fig. 3 describes the components of the Relay Detection System 102 in
accordance with a particular embodiment of the present invention.

An Information Element Receiver 30 monitors communications that MH 110
receives. It potentially also monitors communications that MH 110 sends. It potentially
also monitors communications that Outgoing Communication Sender 36 sends.

A Feature Incompatibility Analyzer 34 determines whether ISD-Feature and
PRD-Feature are Incompatible Features.

An optional Feature Discovery Module 32 discovers the ISD-Feature and/or the
PRD-Feature.

An optional Outgoing Communication Sender 36 sends one or more Outgoing
Communications.

An optional Parameter Obtainer 38 obtains one or more parameters indicative of
(a) the ISD-Feature, (b) the PRD-Feature and/or (c) whether ISD-Feature and
PRD-Feature are Incompatible Features.

An optional Feature Database 40 contains a list of pairs of features and whether
they are Incompatible Features. For example, the Feature Database 40 may contain a
description of which HTML clients are supported by each operating system.
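
As one possible picture of the Feature Database 40, the sketch below stores
unordered pairs of feature labels together with a flag saying whether the pair is
incompatible. The class name, method names and the use of plain strings as feature
labels are assumptions for illustration; the text does not specify a storage format.

```python
class FeatureDatabase:
    """Stores unordered pairs of features and whether each pair is incompatible."""

    def __init__(self):
        self._pairs = {}

    def record(self, feature_a, feature_b, incompatible):
        """Remember whether the two features can belong to a single device."""
        self._pairs[frozenset((feature_a, feature_b))] = bool(incompatible)

    def are_incompatible(self, feature_a, feature_b):
        """Return True or False for a known pair, or None when the pair is unknown."""
        return self._pairs.get(frozenset((feature_a, feature_b)))
```

Keying on a `frozenset` makes the lookup independent of the order in which the two
features are supplied, and returning None for an unknown pair lets the caller
distinguish missing knowledge from a recorded compatible pair.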



Miscellaneous 

It is to be understood that according to some embodiments, the present invention
provides for using any combination of features or parameters disclosed in the present
document to be useful for determining if a potential relay device is a relay device and/or
for determining if received information elements were relayed by a relay device.

In some embodiments described above ISD-Feature and/or PRD-Feature were
related to latency and/or to communication rate. It will be appreciated by someone
skilled in the art that these are specific cases of the more general case in which
ISD-Feature and/or PRD-Feature are related to communication performance.
'Communication performance' is a term referring to all the parameters
describing the speed, rate, time, efficiency etc. of a communication.

In determining whether two features are Incompatible Features, RDS 102 should
consider the time at which each of the features was discovered. For example, if the two
features are wall clock configurations equal to 9:05am and 10:05am, and the second
feature was discovered exactly one hour after the first, then the two features could not
be considered Incompatible Features, because the difference in time of discovery
directly explains the difference in the features. However, if the same two features were
discovered at approximately the same time, they could be considered Incompatible
Features. In another example, if one feature is an RTT measurement of 900
milliseconds and the other feature is an RTT of 940 milliseconds, and the first feature
was discovered 24 hours after the other, then the features are not necessarily
Incompatible Features, because network topology changes may account for this
difference. However, if the same two features were discovered during the same 100
millisecond period, the likelihood of topology changes is smaller, and thus the features
are more likely to be Incompatible Features.
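
For the wall clock example above, accounting for the time of discovery amounts to
subtracting the interval between the two observations from the difference between
the two clock readings. The following sketch shows that adjustment; the function
name and the five-minute tolerance are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def clocks_incompatible(clock_a, seen_a, clock_b, seen_b,
                        tolerance=timedelta(minutes=5)):
    """Return True when two wall clock readings cannot come from one clock.

    clock_a and clock_b are the reported wall clock values; seen_a and seen_b
    are the times at which each value was observed.
    """
    unexplained = (clock_b - clock_a) - (seen_b - seen_a)
    return abs(unexplained) > tolerance
```

Two readings one hour apart that were also observed one hour apart leave nothing
unexplained and are reported as compatible, while the same readings observed at the
same moment leave a full hour unexplained and are reported as incompatible.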

If the features were discovered at nearly the same time, then RDS 102 should
consider whether these features are unlikely to concurrently relate to the same device.

The numerous innovative teachings of the present application are described with
particular reference to a presently disclosed embodiment. However, it should be
understood that this class of embodiments provides only a few examples of the many
advantageous uses of the innovative teachings herein. In general, statements made in
the specification of the present application do not necessarily delimit any of the various
claimed inventions. Moreover, some statements may apply to some inventive features
but not to others.

While particular embodiments of the invention have been shown and described, it
will be obvious to those skilled in the art that changes and modifications may be made
without departing from the invention in its broader aspects, and therefore, the aim in the
appended claims is to cover all such changes and modifications as fall within the true
spirit and scope of the invention.

According to some embodiments, the term "unlikely" as used herein refers to an
event that occurs with at most a maximal probability, where this maximal probability
is at most 40%.

In other embodiments, this maximal probability is at most 30%.

In other embodiments, this maximal probability is at most 20%.

In other embodiments, this maximal probability is at most 10%.

In other embodiments, this maximal probability is at most 5%.

In other embodiments, this maximal probability is at most 1%.

In other embodiments, this maximal probability is at most 1/2%.

Similarly, a "likely" event is an event that occurs with a probability
of 100% minus the probability of the "unlikely" event.
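
These definitions can be expressed as a pair of threshold tests. The sketch below is illustrative only; the function names are assumptions, and the maximal probability is left as a parameter because it varies between embodiments.

```python
def is_unlikely(probability, max_probability=0.40):
    """True if the event's probability does not exceed the maximal
    probability chosen for the embodiment (40% by default here)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return probability <= max_probability


def is_likely(probability, max_probability=0.40):
    """True if the event's probability is at least 100% minus the
    maximal probability of the corresponding 'unlikely' event."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return probability >= 1.0 - max_probability


print(is_unlikely(0.05, max_probability=0.10))  # True
print(is_unlikely(0.50))                        # False
print(is_likely(0.95, max_probability=0.10))    # True
```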

According to some embodiments, a "single" device is physically one device.
Nevertheless, it is recognized in the art that electronic devices connected to data
networks such as relay devices, monitored hosts and information source devices are
often a cluster of several physically different devices logically configured to present
themselves to a data network and/or devices on the data network as a "single" device.
Thus, as used herein, a "single" device refers both to a single physical device as well as
a plurality of physical devices logically configured to present themselves as a single
device.

According to some embodiments, locations that are "different" as used herein
refer to places located in different countries.

According to some embodiments, locations that are "different" as used herein
refer to places located in different states.

According to some embodiments, locations that are "different" as used herein
refer to places located in different provinces.

According to some embodiments, locations that are "different" as used herein
refer to places located in different continents.

According to some embodiments, locations that are "different" as used herein
refer to places located in different time zones.

According to some embodiments, locations that are "different" as used herein
refer to locations separated by a minimum of 100 kilometers, or by a minimum of
aOO kilometers, or by a minimum of 1000 kilometers, or by a minimum of 2500
kilometers.
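
A distance-based test of whether two locations are "different" might look like the sketch below. It is an illustration under stated assumptions: locations are given as latitude and longitude in degrees, the distance is the great-circle (haversine) distance on a spherical Earth, and the function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def locations_differ(loc_a, loc_b, min_km=100.0):
    """True if the two (lat, lon) locations are at least min_km apart."""
    return distance_km(*loc_a, *loc_b) >= min_km


london = (51.5074, -0.1278)
paris = (48.8566, 2.3522)
print(locations_differ(london, paris, min_km=100.0))   # True
print(locations_differ(london, paris, min_km=1000.0))  # False
```

The `min_km` parameter stands in for whichever minimum separation a given embodiment uses.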
WHAT IS CLAIMED IS:

1) A method of determining whether a potential relay device is a relay device, the
method comprising:

a) receiving first and second information elements from the potential relay
device, wherein the potential relay device is an original source of said
second information element; and

b) determining whether a feature of an original source of said first
information element and a feature of the potential relay device are
features unlikely to relate to a single device,

wherein a positive result of said determining is indicative that the potential relay
device is a relay device.

2) The method of claim 1 wherein said second information element is of a type that
a relay device of a class of relay devices is unlikely to relay.

3) The method of claim 2 wherein said class of relay devices is selected from the
group consisting of a SOCKS proxy, an HTTP proxy using the GET method, an
HTTP proxy using the CONNECT method, an IP router and a Network Address
Translation device.

4) The method of claim 1 wherein said second information element is part of a
communication, wherein the communication is of a type selected from the group
consisting of IP, TCP, ICMP, DNS, HTTP, SMTP, TLS, and SSL.

5) The method of claim 1 wherein said first information element is part of a
communication, wherein the communication is of a type selected from the group
consisting of IP, TCP, ICMP, DNS, HTTP, SMTP, TLS, and SSL.

6) The method of claim 1 wherein said first and said second information elements
are parts of a single communication.

7) The method of claim 6 wherein said first and said second information elements
are sent in two different layers of a protocol stack.

8) The method of claim 1 wherein said stage of determining comprises:

i) discovering said feature of an original source of said first information
element; and

ii) discovering said feature of the potential relay device.

9) The method of claim 8 wherein said stage of determining further comprises:

iii) comparing said feature of an original source of said first information
element with said feature of the potential relay device.

10) The method of claim 8 further comprising:

c) obtaining a parameter indicative of said feature of an original source of
said first information element; and

d) obtaining a parameter indicative of said feature of the potential relay
device.

11) The method of claim 8 wherein said stage of determining further comprises:

iii) considering a time at which at least one of said feature of an original
source of said first information element and said feature of the potential
relay device was discovered.

12) The method of claim 1 further comprising:

c) obtaining a parameter indicative of a relationship between said feature
of said original source of said first information element and said feature of
the potential relay device.

13) The method of claim 12, wherein said stage of determining includes analyzing
said parameter indicative of a relationship between said feature of said original
source of said first information element and said feature of the potential relay
device.

14) The method of claim 12 wherein said parameter is obtained from at least one of
said first information element and said second information element.

15) The method of claim 1 further comprising:

c) sending an outgoing communication to at least one of said original
source of said first information element and the potential relay device;
and

d) receiving a third information element from said at least one of said
original source of said first information element and the potential relay
device.

16) The method of claim 15, further including:

e) deriving from said third information element information related to a
feature of said at least one of said original source of said first
information element and the potential relay device.

17) The method of claim 15 further comprising:

iii) verifying that an original source of said third information element is said
original source of said first information element.

18) The method of claim 15 further comprising:

iii) verifying that an original source of said third information element is the
potential relay device.

19) The method of claim 15 wherein said third information element is selected from
the group consisting of an ICMP message, an ICMP Echo Reply message, a
DNS query, an HTTP request, an HTTP response, an HTTP 'Server' header, an
IP address, a TCP port, a TCP Initial Sequence number, a TCP Initial Window, a
WHOIS record, and a reverse DNS record.

20) The method of claim 1 wherein at least one of said feature of an original source
of said first information element and said feature of the potential relay device is a
feature related to a configuration status.

21) The method of claim 20 wherein said feature related to a configuration status is
selected from the group consisting of an operating system type, an operating
system version, a software type, an HTTP client type, an HTTP server type, an
SMTP client type, an SMTP server type, a time setting, a clock setting and a time
zone setting.

22) The method of claim 21 wherein said determining includes examining a
parameter indicative of said feature related to a configuration status.

23) The method of claim 21 wherein said parameter is selected from the group
consisting of an HTTP 'User-Agent' header, an RFC 822 'X-Mailer' header, an
RFC 822 'Received' header, an RFC 822 'Date' header, a protocol
implementation manner, a TCP/IP stack fingerprint, an IP address, a TCP port, a
TCP initial sequence number, and a TCP initial window.
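
As one illustration of a parameter indicating a configuration status, the sketch below reads the time zone setting declared in an RFC 822 'Date' header and compares it with an expected offset. This is a minimal example under assumptions: the function names and sample header are hypothetical, and the expected offset is assumed to come from some other source, such as the location associated with the potential relay device's IP address.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def utc_offset_from_date_header(date_header):
    """Return the UTC offset declared in a 'Date' header, or None when the
    header does not carry a usable time zone (for example '-0000')."""
    parsed = parsedate_to_datetime(date_header)
    return parsed.utcoffset()


def time_zone_mismatch(date_header, expected_offset):
    """True if the declared offset differs from the expected one,
    None if the header declares no usable offset."""
    declared = utc_offset_from_date_header(date_header)
    if declared is None:
        return None
    return declared != expected_offset


header = "Mon, 05 Jan 2004 09:05:00 +0200"   # hypothetical sample header
print(utc_offset_from_date_header(header))              # 2:00:00
print(time_zone_mismatch(header, timedelta(hours=-5)))  # True
```

A mismatch on its own is only a hint, since a user may simply have configured an unusual time zone.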

24) The method of claim 1 wherein at least one of said feature of a source of said
first information element and said feature of the potential relay device is a feature
related to communication performance.

25) The method of claim 24 wherein said feature related to communication
performance is selected from the group consisting of a measured communication
performance, a measured relative communication performance, and an
estimated communication performance.

26) The method of claim 24 wherein said feature related to communication
performance is selected from the group consisting of a latency of communication,
a latency of an incoming communication, a latency of an outgoing
communication, a round trip time of a communication, a communication rate, an
incoming communication rate, an outgoing communication rate, a maximum
communication rate, an incoming maximum communication rate, and an
outgoing maximum communication rate.

27) The method of claim 24 wherein said determining includes examining a
parameter indicative of said feature related to communication performance.

28) The method of claim 27 wherein said parameter is selected from the group
consisting of time of receipt of an information element, time of sending of an
information element, a round trip time, a round trip time gap, an IP address, a
WHOIS record, a reverse DNS record, and a rate of acknowledged information.

29) The method of claim 28 wherein a higher round trip time gap is indicative of a
higher likelihood that a relay device is being used for malicious purposes.

30) The method of claim 24, wherein said feature related to communication 
performance is estimated from Information about at least one of said original 
source of said first communication and the potential relay device. 

31) The method of claim 30, wherein said information about at least one of said 
original source of said first communication and the potential relay device is 
selected from the group consisting of a location of a device, a hostname of a 
device, and an administrator of a device. 

32) The method of claim 1 wherein at least one of said feature of an original source
of said first information element and said feature of the potential relay device is
selected from the group consisting of a sub-network, an administrator, and a
location.

33) The method of claim 32 wherein said determining includes examining a
parameter indicative of at least one of said feature of a source of said first
communication and said feature of a source of said second communication, and
said parameter is selected from the group consisting of an HTTP 'User-Agent'
header, an RFC 822 'X-Mailer' header, an RFC 822 'Received' header, an RFC
822 'Date' header, an IP address, a WHOIS record, and a reverse DNS record.

34) A method of determining whether a potential relay device is a relay device, the
method comprising:

a) receiving first and second information elements from the potential relay device,
wherein the potential relay device is an original source of said second information
element; and

b) analyzing a configuration status of an original source of at least one of said
first and said second information elements, said configuration status selected
from the group consisting of an operating system type, an operating system
version, a software type, an HTTP client type, an HTTP server type, an SMTP
client type, an SMTP server type, a time setting, a clock setting, and a time zone
setting.

35) A method of determining whether a potential relay device is a relay device, the
method comprising:

a) receiving first and second information elements from the potential relay device,
wherein the potential relay device is an original source of said second information
element; and

b) analyzing a feature related to communication performance of an original
source of at least one of said first and said second information elements.

36) The method of claim 35, wherein said feature related to communication
performance is selected from the group consisting of a latency of communication,
a latency of an incoming communication, a latency of an outgoing
communication, a round trip time of a communication, a communication rate, an
incoming communication rate, an outgoing communication rate, a maximum
communication rate, an incoming maximum communication rate, and an
outgoing maximum communication rate.

37) A method of determining whether a potential relay device is a relay device, the
method comprising:

a) sending a message to an information source device, triggering said
information source device to send a DNS request; and

b) determining from said DNS request whether said potential relay device is a
relay device.
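
One way such a DNS-based check could be arranged is sketched below. This is a hypothetical illustration, not the claimed procedure: it assumes the sent message embeds a unique hostname under a zone whose DNS queries are logged, and that a query arriving from a resolver outside the potential relay device's network hints at a relayed communication. The zone name, the log format, the /24 prefix and all function names are assumptions.

```python
import ipaddress
import uuid


def make_probe_hostname(zone="probe.example.com"):
    """Create a unique hostname to embed in the message sent to the
    information source device (the zone name is hypothetical)."""
    token = uuid.uuid4().hex
    return token, f"{token}.{zone}"


def resolver_outside_relay_network(dns_log, token, relay_ip, prefix_len=24):
    """Inspect logged (queried_name, resolver_ip) pairs for the probe token.

    Returns True if the matching query came from outside the relay's
    assumed network, False if from inside, None if no query was seen."""
    relay_net = ipaddress.ip_network(f"{relay_ip}/{prefix_len}", strict=False)
    for name, resolver_ip in dns_log:
        if name.startswith(token + "."):
            return ipaddress.ip_address(resolver_ip) not in relay_net
    return None


# Addresses below come from ranges reserved for documentation.
log = [("abc123.probe.example.com", "198.51.100.7")]
print(resolver_outside_relay_network(log, "abc123", "192.0.2.10"))    # True
print(resolver_outside_relay_network(log, "abc123", "198.51.100.1"))  # False
```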

38) A method of determining whether a potential relay device is a relay device, the
method comprising:

a) receiving first and second information elements from the potential relay
device; and

b) determining whether a feature of an original source of said first
information element and a feature of an original source of said second
information element are features unlikely to relate to a single device,

wherein a positive result of said determining is indicative that the potential relay
device is a relay device.

39) A method of determining whether a potential relay device is a relay device, the
method comprising:

a) receiving first and second information elements from the potential relay
device, wherein the potential relay device is an original source of said second
information element; and

b) checking whether a round-trip time to the potential relay device is significantly
different than a round-trip time to an original source of said first information
element.
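
The round-trip time check can be illustrated with a short sketch. It assumes that several RTT samples to each party have already been measured, takes the median of each set to dampen outliers, and uses a hypothetical 100 millisecond threshold for "significantly different"; neither the threshold nor the function names come from the claims.

```python
from statistics import median


def rtt_gap_ms(source_samples_ms, relay_samples_ms):
    """Median RTT to the original source minus median RTT to the relay."""
    return median(source_samples_ms) - median(relay_samples_ms)


def rtt_significantly_different(source_samples_ms, relay_samples_ms,
                                threshold_ms=100.0):
    """True if the absolute RTT gap exceeds the (assumed) threshold."""
    return abs(rtt_gap_ms(source_samples_ms, relay_samples_ms)) > threshold_ms


# Similar RTTs: consistent with a single device.
print(rtt_significantly_different([900, 910, 905], [940, 935, 950]))  # False
# Large gap: the original source appears much farther away than the relay.
print(rtt_significantly_different([620, 640, 610], [80, 85, 90]))     # True
```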

40) A method of determining whether a potential relay device is a relay device, the
method comprising:

a) receiving first and second information elements from the potential relay
device, wherein the potential relay device is an original source of said
second information element; and

b) checking whether an operating system of the potential relay device is
different than an operating system of an original source of said first
information element.
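
An operating system comparison of this kind might be sketched as follows. The example assumes that the original source's operating system is inferred from an HTTP 'User-Agent' header and that the potential relay device's operating system family has already been inferred elsewhere, for instance from a TCP/IP stack fingerprint. The marker list is deliberately small and the function names are hypothetical.

```python
# Ordered so that more specific markers are tested first
# (Android user agents also contain the word "Linux").
_OS_MARKERS = (
    ("windows", "windows"),
    ("android", "android"),
    ("mac os x", "macos"),
    ("linux", "linux"),
)


def os_from_user_agent(user_agent):
    """Guess an operating system family from a User-Agent string."""
    lowered = user_agent.lower()
    for marker, family in _OS_MARKERS:
        if marker in lowered:
            return family
    return None


def operating_systems_differ(user_agent, relay_os_family):
    """True if the two operating systems differ, None if either is unknown."""
    source_os = os_from_user_agent(user_agent)
    if source_os is None or relay_os_family is None:
        return None
    return source_os != relay_os_family


ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
print(operating_systems_differ(ua, "linux"))    # True
print(operating_systems_differ(ua, "windows"))  # False
```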

41) A method of determining whether a potential relay device is a relay device, the 
method comprising: 

a) receiving first and second information elements from the potential relay 
device, wherein the potential relay device is an original source of said
second information element; and

b) checking whether a location of the potential relay device is different
than a location of an original source of said first information element.

42) A method of determining whether a potential relay device is a relay device, the 
method comprising: 

a) receiving first and second information elements from the potential relay 
device, wherein the potential relay device is an original source of said 
second information element; and 

b) checking whether an administrator of the potential relay device is
different than an administrator of an original source of said first information 
element. 

43) A method of determining whether a potential relay device is a relay device, the 
method comprising: 

a) determining whether a feature of an original source of a first information
element and a feature of the potential relay device are features unlikely to
relate to a single device,

wherein the potential relay device is a transmitter of said first information element
and of a second information element,

wherein the potential relay device is an original source of said second information
element, and

wherein a positive result of said determining is indicative that the potential relay
device is a relay device.

44) A system for determining whether a potential relay device is a relay device, the
system comprising:

a) an information element receiver, for receiving information elements
from a plurality of devices including an information source device and the
potential relay device; and

b) a feature incompatibility analyzer, for determining whether a feature of
said information source device and a feature of the potential relay device
are features unlikely to relate to a single device.

45) The system of claim 44 further comprising:

c) a feature discovery module, for discovering at least one feature
selected from the group consisting of a feature of said information source
device and a feature of the potential relay device.

46) The system of claim 44, wherein said information element receiver is
further configured to receive information elements from a monitored host.

47) The system of claim 44, further comprising:

c) an outgoing information element sender.

48) The system of claim 44, further comprising:

c) a parameter obtainer, for obtaining at least one parameter selected
from the group consisting of a parameter indicative of a feature of an
information source device, a parameter indicative of a feature of the
potential relay device, and a parameter indicative of whether a feature of
said information source device and a feature of said potential relay device
are features unlikely to relate to a single device.

49) The system of claim 44, further comprising:

c) a feature database for storing a map between pairs of features and data
indicative of whether said pairs of features are incompatible features.
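
A feature database of this kind could be as simple as a map keyed by unordered pairs of features. The sketch below is one possible in-memory arrangement, with hypothetical class and method names and features represented as (kind, value) tuples; a real system would presumably use persistent storage.

```python
class FeatureDatabase:
    """Maps unordered pairs of features to an incompatibility flag."""

    def __init__(self):
        self._pairs = {}

    def set_incompatible(self, feature_a, feature_b, incompatible=True):
        # frozenset makes the pair order-independent.
        self._pairs[frozenset((feature_a, feature_b))] = incompatible

    def are_incompatible(self, feature_a, feature_b):
        """Return the stored flag, or None if the pair is unknown."""
        return self._pairs.get(frozenset((feature_a, feature_b)))


db = FeatureDatabase()
db.set_incompatible(("os", "windows"), ("os", "linux"))
db.set_incompatible(("os", "windows"), ("http_client", "MSIE"), incompatible=False)

print(db.are_incompatible(("os", "linux"), ("os", "windows")))   # True
print(db.are_incompatible(("os", "linux"), ("os", "macos")))     # None
```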

50) Computer software, residing on a computer-readable storage medium,
comprising instructions for causing a computer to:

a) receive first and second information elements from a potential
relay device, wherein the potential relay device is an original source
of said second information element; and

b) determine whether a feature of an original source of said first
information element and a feature of said potential relay device are
features unlikely to relate to a single device,

wherein a positive result of said determining is indicative that said
potential relay device is a relay device.